Introduction


The  monograph  which  follows  comprises  four  essays  which  are 
the  first  fruits  of  the  formal  analytic  aspect  of  our  research 
program  on  (North)  American  braille.  These  four  essays  are  largely 
separate  and  independent  studies,  which  can  be  read  as  defensible, 
integral  investigations  on  their  own.  They  do  not  concatenate  as 
a  series  with  logical  implication  and  essential  ordering  in  their 
structures  as,  say,  four  chapters  of  a  book  might.  Yet  they  grow 
out  of  our  work  in  trying  to  understand  the  composition,  internal 
relations,  and  functional  economy  of  the  braille  code;  they  arose 
as  we  tried  to  approach  and  manage  our  total  task,  and  thus  have  an 
ultimate  cohesion;  they  suggested  themselves  as  problems  requiring 
a  definition  and,  hopefully,  a  solution  or  series  of  resolutions, 
and  they  engaged  our  serious  work  more  or  less  in  the  order  given. 
For  these  reasons  it  seems  clear  to  us  that  they  may  usefully  be 
presented  in  the  order  given. 

We  say  "more  or  less  in  the  order  given"  since  these  problems, 
and  many  others  which  we  hope  to  investigate  and  report  on  in  the 
future,  tended  to  rush  to  mind  simultaneously  or  in  overlapping 
fashion  as  soon  as  we  addressed  our  larger  task  seriously.  It  also 
happens  that  after  one  task  has  been  begun  it  must  be  interrupted 
and  another  unforeseen  problem  solved  in  order  to  bring  the  first 
study  to  an  accurate  conclusion;  there  is  no  need  to  trouble  the 
reader with the details of such necessary housekeeping or
monitoring.

The  first  study  is  a  mild  updating,  with  no  substantive 
revision, of Hamp & Caton, JVIB, 78, 210-214 (1984). The notes
which have been added are intended generally to amplify or clarify
passages  which  seem  to  have  offered  difficulty  to  readers  and 
associates;  the  original  was  written  to  convey  a  maximum  of  the 
analysis reported on in non-specialist language and within brief
scope  in  a  journal  which  must  husband  space  wisely.  It  is  clear 
that  we  were  not  always  sufficiently  explicit  and  felicitous  in  our 
exposition,  and  it  is  hoped  that  the  present  reprinting  (with 
several  typesetting  corrigenda)  will  make  this  article  more  useful, 
as  well  as  more  available,  to  those  who  may  be  interested  in 
following  the  fate  of  these  braille  studies  further. 

The  purpose  of  the  first  study  was  and  is  to  set  forth  the 
basis  and  principles  of  the  analysis  of  the  code  (Grade  2, 
Literary)  which  was  in  some  form  prerequisite  to  the  design  of  the 
elementary teaching materials which were issued as Patterns: The
Primary  Braille  Reading  Program  (1980).  It  is  a  very  compressed 
account,  with  the  elements  identified  in  the  various  classes 
presented  only  in  selected  illustrative  form.  In  the  near  future, 
when  other  pressing  desiderata  have  been  disposed  of,  it  is 
intended  to  issue  a  more  ample  presentation  and  justification  of 
this  same  terrain. 

The  second  study  followed  the  first  in  written  draft  at  a  much 
later date (1989-1991), largely because in the interim we were
occupied  with  the  planning,  writing,  revision,  testing,  and 
production  of  Patterns:  The  Primary  Braille  Spelling  and  English 
Program (Level A 1992, B 1993, C and D in press and preparation).
This study announces a plan to expand and refine the analysis
sketched in the content of the 1984 study. In order to make the
intellectual background of our investigations and their stage of
theoretical advancement more readily grasped, the development and
change of 20th century linguistic theory is compactly sketched in
thumbnail fashion; lest the compression reach caricature
proportions, some references to recent general linguistic
literature, which may in turn lead to fuller references, have been
included.

For  our  purposes  here,  the  core  of  the  second  study  is  a 
description  of  the  corpus  of  American  English  text  which  forms  the 
data  base  from  which  our  sample  has  been  drawn  for  the  studies  here 
described.  That  corpus  is  the  well  known  and  meticulously 
constructed  "Brown  Corpus".  Its  composition,  genre  constituency, 
and  gross  quantitative  character,  and  these  features  of  our  sample, 
are  set  forth,  as  well  as  our  treatment  of  the  text  sample  for 
entry  in  computer  storage.  The  study  concludes  with  a  sample 
elementary  result  of  processing  this  text  excerpt  for  a  response  to 
a  simple  query:  What  is  the  crude  count  of  braille  units,  classed 
by  the  categories  identified  in  the  first  (1984)  study,  found  in 
the  sample  drawn  from  the  Brown  Corpus? 

We  say  here  "crude  count"  because  we  rapidly  found  that  if  we 
were  ever  to  finish  this  task  in  a  manageable  time  it  would  be 
necessary  to  pool  certain  1984  classes.  Meantime  a  further 
painstaking  study  has  been  conducted  over  the  past  four  years  which 
will  enable  us  to  give  accurate  figures  and  totals  for  all  the 
braille unit classes which we originally desired; and a complete
report on that portion, together with a revision of the end part of
this  second  study,  will  soon  be  available. 

The  third  study,  an  analysis  of  word  lengths,  might  seem  at 
first  blush  to  be  almost  tiresomely  routine  and  boring  as  an 
operation,  although  the  result  would  certainly  be  desirable,  since 
we  really  know  little  about  this  that  has  anything  to  do  with  the 
linguistic  and  cognitive  processing  of  language.  But  it  takes  only 
brief  reflection  from  the  linguistic  point  of  view  to  recall  that 
of  all  linguistic  entities  recognized  in  countless  languages  by  the 
folk  intuition  of  native  speakers  the  notion  of  "word"  is  perhaps 
the  most  elusive:  When  you  hear  language  spoken  there  is  often  no 
stable  acoustic  mark  to  match  audibly  the  space  you  read  on  paper 
or  between  braille  shapes.  Small  elements,  such  as  in  hit 'em  hard, 
cling  to  others  so  that  it  becomes  difficult  to  say  what  criteria 
distinguish  an  affix  from  a  clitic  (an  attached  but  independent,  if 
reduced, word). Of course, if one seeks semantic criteria, there
are  numerous  difficulties  in  establishing  to  what  degree  a  word 
should  have  an  isolable  meaning  (How  close  to  the  encyclopedic 
individuality  of  'horse'  is  the  meaning  of  of  or  that?),  whether 
names,  e.g.  Jacob,  truly  have  meaning,  whether  two  word-like  forms 
sharing  a  meaning  are  one  or  two  (e.g.  prime  minister,  scarlet 
fever,  chaise  longue,  >  lounge  vs.  input).  These,  and  many  other 
riddles  or  just  complexities  of  reasoning,  are  problems  that  worry 
linguists  (who  must  account  in  detail  also  for  spoken  language) 
when they are called on to define and isolate the notion "word",
which seems so basic to native speakers.

But additionally we find when we confront written text that
there  are  further  difficulties  in  recognizing  word  units  with 
confidence.  Even  if  we  limit  ourselves  to  a  simplistic  approach 
and  accept  as  words  whatever  appears  between  spaces  (e.g.  counting 
a hyphen as not a space), there are still conundrums in store for
us:  What  do  we  do  with  numerals,  especially  large  ones?  Are  they 
words  made  of  numbers,  or  phrases  composed  of  numerical  words?  How 
are  abbreviations  and  acronyms  to  be  regarded?  Are  mathematical 
formulas  and  formalized  notations  a  part  of  normal  text?  Do 
ellipses  exist  or  are  they  really  their  parent  full  forms?  What 
about  odd  shapes,  such  as  Greek  letters  or  printer's  signs?  What 
is  the  rational  way  to  count  hyphenated  locutions?  We  have  already 
mentioned  above  the  linguistic  aspect  of  difficulty  with  proper 
names;  but  there  is  the  further  problem  of  their  complexity  (e.g. 
New York, San Diego, Baile Atha Cliath, John Smith, Elias Tate III,
Harry S Truman, Constitution Hall, La Scala, The Reverend Robert
Walker, and the punctuated phrasal The Atchison, Topeka and Santa
Fe) and their many idiosyncratic categories (Gone with the Wind,
She Stoops to Conquer, Xerox). Allied to the last is the problem
whether  the  highly  divergent  class  of  foreign  words  is  usefully, 
for  our  purpose,  English. 
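
The simplistic approach described above (a word is whatever appears between spaces, with a hyphen not counted as a space) can be sketched in a few lines of Python. This is only an illustration, not the procedure actually applied to the corpus sample: the function names, the punctuation stripped from the edges of a span, and the rule for setting problematic spans aside (here, any span containing a digit or no letter at all) are assumptions introduced for the example.

```python
from collections import Counter

EDGE_PUNCTUATION = '.,;:!?"()[]'  # assumed set; apostrophes and hyphens are kept


def inter_space_spans(text):
    """Split on whitespace only, so a hyphen never ends a span."""
    return text.split()


def is_problematic(span):
    """Hypothetical filter: numerals and letterless signs are set aside."""
    has_digit = any(ch.isdigit() for ch in span)
    has_letter = any(ch.isalpha() for ch in span)
    return has_digit or not has_letter


def length_distribution(text):
    """Return (Counter of span lengths, list of spans set aside)."""
    core = Counter()
    set_aside = []
    for raw in inter_space_spans(text):
        span = raw.strip(EDGE_PUNCTUATION)
        if not span:
            continue
        if is_problematic(span):
            set_aside.append(span)
        else:
            core[len(span)] += 1  # length counted in print characters
    return core, set_aside
```

Applied to the sentence "The well-known corpus has 500 samples.", the sketch treats well-known as a single span of length 10 and sets 500 aside for separate treatment. A comparable count for braille would need the braille units of the 1984 analysis in place of print characters.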

Obviously,  these  problematic  inter-space  spans  have  for  us  an 
interest  which  is  marginal  or  segregable.  Our  primary  interest 
must  be  to  characterize  the  English  that  constitutes  the  bulk, 
core,  and  working  inventory  of  essential  written  discourse  of  all 
kinds. Therefore, after pondering the categories that seem to
appear in our sample corpus, we elected to extract these in
separate  lists  and  to  reserve  them  for  separate  treatment;  we  have 
already  made  extensive  progress  with  the  last  mentioned,  and  intend 
to  report  on  that  at  a  later  date.  At  first  we  feared  that  in 
following  this  up  we  might  find  ourselves  consuming  valuable  time 
on  a  tiny  corpus  of  trivial  curiosities;  as  will  be  seen  on  a  later 
occasion,  this  corpus-fraction  is  by  no  means  small  or  negligible, 
nor  is  it  without  rich  theoretical  and  practical  interest. 

Having,  then,  set  aside  these  deviant  categories,  we  have 
reached  some  interesting  results  on  the  length  of  English  braille 
words  in  relation  to  inkprint,  which  we  present  herewith.  It 
should  be  recalled  and  emphasized  that  meaningful  results  of  this 
sort  cannot  be  attained  without  a  prior  analysis  that  identifies 
basic  primes  of  braille  comparable  to  our  braille  units;  hence  the 
essentiality  of  the  analysis  dealt  with  in  the  first  (1984)  study 
for  all  these  succeeding  studies. 

The  final  study  in  this  volume,  one  of  a  number  of 
methodologically  analogous  studies  already  included  in  our  work 
plans,  deals  with  a  feature,  or  syndrome  characteristic,  of  braille 
structure  which  is  detailed  in  nature  but  salient  in  its  formal 
character  and  important  to  a  serious  consideration  of  braille  in 
its  social  setting:  the  incidence  of  certain  sequences  (of 
inkprint)  that  by  rule  do  and  do  not  require  the  braille  units 
known as lower-signs. This study, it is believed, illustrates
sharply the clarity and precision which immediately emerge with
the application of criteria afforded by a set of primes such as
those discussed in the 1984 study. On the other hand, since this
general  point  has  already  been  suggested,  and  since  an  appreciation 
of  the  highly  different  behavior  of  these  nine  sequences  or 
braille-configurations  and  their  actual  use  will  emerge  only  by 
minimally  studying  our  tabulations,  our  criteria  for 
classification,  and  our  commentary,  there  is  no  advantage  served  by 
further  introductory  remarks  on  this  study:  Those  familiar  with 
the  braille  code  will  already  recognize  the  importance  and 
interest--or perhaps the idiosyncratic annoyance--of the "lower-sign
words". Teachers of braille are familiar with the special
effort  that  these  handy  but  quirky  items  in  the  tool  kit  impose. 

When  we  started  on  some  of  these  problems  over  fifteen  years 
ago,  without  access  to  the  kinds  of  computer-assisted  aids  to  the 
processing  of  text  data  which  we  command  today1,  we  were  already 
rapidly  aware  of  the  pressing  need  for  precise  study  of  a  myriad 
aspects  of  braille  structure,  function,  and  use  for  the  prosecution 
of  every  phase  of  the  tasks  of  instruction,  for  the  teaching  of 
young  and  old.  We  could  not  have  foreseen  the  gratifying 
possibility  that  this  work  might  come  to  be  immediately  placed  in 
relation  to  fruitful  discussions,  which  now  actively  make  welcome 
progress,  leading  to  a  collaborative  Unified  Code. 

EPH 

15  April  1994 


1. But we must never forget that the computer cannot substitute
for  old-fashioned  thinking  and  curiosity,  perhaps  with  the  help  of 
a  brailler  or  pencil. 


A  Fresh  Look  at  the  Sign  System 
of  the  Braille  Code 

Eric P. Hamp
Hilda  Caton 

American  Printing  House  for  the  Blind 

1984 

(corrected and lightly revised 1993)

Abstract

As one of the steps in the development of Patterns: The Primary
Braille  Reading  Program,  an  analysis  of  American  English  braille  as 
a  written  code  was  undertaken  from  a  linguistic  viewpoint.  The 
object  was  to  view  and  analyze  the  braille  code  internally  and  not 
as  an  encipherment  of  printed  language;  thus  we  view  such  arrays  of 
dots  as  (brl)  as  a  single  meaningful  unit  for  a  young  blind 
learner,  on  a  par  with  (and)  or  g,  and  not  as  a  "contraction"  of 
some  full  form  as  yet  unknown  and  forever  unseen  in  its 
presupposedly  expanded  shape.  Such  an  analysis  would  be 
appropriate  for  purposes  of  instructing  those  who  have  never 
mastered  or  even  seen  print  designed  for  the  sighted.  The  process 
of  analysis  was  conducted  so  as  to  yield  the  primes  which  manifest 
the  structure  of  braille  and  the  relations  which  characterize  them. 
The  fundamental  relation  exploited  in  order  to  identify  these 
primes  was  the  "sign  relation"  as  notably  expounded  in  linguistics 
by  Ferdinand  de  Saussure.  The  analysis  led  to  a  regrouping, 
identification,  and  fresh  description  of  the  many  elements  of 
American English literary braille, Grade 2, with the intention to
facilitate both the literary teaching and the learning of reading

A Fresh Look at the Sign System
of  the  Braille  Code 


There  is  no  dearth  of  documentation  to  support  the  assertion 
that  much  of  the  difficulty  young  visually  handicapped  children 
encounter  in  learning  to  read  can  be  ascribed  to  the  fact  that  the 
materials  used  have  been  transcribed  into  braille  from  print 
editions. The problems made by this practice have been studied and
described in detail (cf. Ashcroft, 1960; Bleiberg, 1970; Caton,
1979; Lowenfeld, Abel, & Hatlen, 1969; Nolan & Kederis, 1969).

In an attempt to overcome some of these problems, a primary
braille reading program has been developed at the American Printing
House for the Blind. Development of this program, Patterns: The
Primary Braille Reading Program (Caton, Pester, & Bradley, 1980),
began  with  an  analysis  of  all  available  research  related  to  the 
braille  code,  including  the  studies  cited  above. 

However,  as  this  analysis  was  conducted,  it  became  apparent 
that  the  information  gained  was  not  adequate  for  the  development  of 
the  most  effective  materials  for  teaching  or  learning  braille 
reading.  One  major  problem  was  that  the  existing  categories,  or 
groupings,  of  the  various  elements  of  the  code  were  confusing  to 
both  teachers  and  students — primarily  because  the  descriptive 
language  used  was  inadequate.  For  example,  terms  like 
"contraction,"  "sign,"  and  "symbol,"  are  used  interchangeably  to 
describe whole-word contractions, part-word contractions, short
forms, alphabet words, lower-cell words, upper-cell words, dot-five
words, whole-cell part words, etc. Many other examples of the
confusion  caused  by  existing  groupings  and  descriptions  of  the 
braille  code  could  be  presented  if  space  permitted;1  but  the  point 
is  that  it  was  not  possible  to  describe  the  braille  code  clearly  to 
the  students  who  would  use  it  to  learn  to  read.  Another  problem 
was  that  no  teaching  material  existed  that  reflected  the  internal 
characteristics  of  the  braille  code,  and  teachers  were  forced  to 
use  materials  reflecting  the  characteristics  of  print.  That  is, 
all  sequencing  of  vocabulary,  reading  skills,  and  teaching 
strategies  was  based  entirely  on  the  principles  of  print  reading. 

For  these  reasons,  it  was  necessary  to  conduct  an  internal 
analysis  of  the  braille  code  that  would  result  in  a  regrouping  and 
fresh  description  of  its  various  elements,  thus  facilitating  both 
the  teaching  and  the  learning  of  braille  reading.  The  new 
descriptions  and  groupings  allowed  teachers  to  discuss  and  describe 
the  various  elements  of  the  code  in  a  clear,  precise  manner,  so 
that  their  students  were  not  confused.  In  addition,  the  analysis 
clarified  the  internal  characteristics  of  the  code,  so  that 
teachers  could  devise  more  effective  teaching  strategies. 

The  authors  wish  to  emphasize  that  the  information  provided 
through  the  research  mentioned  above  is  relevant  and  significant, 
and,  in  fact,  was  used  to  a  large  extent  in  sequencing  the 
vocabulary and skills in Patterns: The Primary Braille Reading
Program.  We  maintain,  however,  that  work  before  Patterns  did  not 
sufficiently  recognize  the  truly  specific  characteristics  of  the 
braille code; and that the internal analysis discussed in this
paper does so.2

In order to present our results in a clear and concise way, we
begin by describing the total process involved in the development
of the reading program, first discussing the problems addressed in
the design of the course material and in the internal analysis
itself.

The  entire  study  was  undertaken  from  a  linguistic  viewpoint. 
The authors wish to emphasize that the analysis and explication of
braille  throughout  the  paper  is  done  without  appeal  to  the 
characteristics  and  setting  of  sighted  print — an  intrusion 
irrelevant to our analysis and too often invoked.3

Toward An Internal Analysis

It  has  long  been  recognized  that  a  major  problem  in  the 
teaching  of  braille  reading  arises  from  differences  between  the 
representations  of  natural  spoken  language  in  print  and  in  the 
braille  code.  These  differences  have  tended  to  be  depicted  in 
terms  of  additions  or  subtractions;  that  is,  people  have  been 
occupied  with  contractions  as  being  somehow  shorter  by  subtraction 
from  the  full  print  form,  and  the  like.  Attention  has  also  been 
focused  on  the  particularities  and  tactile  difficulties  of  shapes 
of  the  dot  configurations  within  cells.  This  is  clearly  important 
in  itself,  yet  has  little  to  do  with  the  functions  which  have  been 
allotted  to  various  character  groups. 

Less  obvious  is  the  fundamental  difference  in  internal 
structure  between  many  aspects  and  subsections  of  the  braille  code 
and those of a full font of type. A further important point to be
stressed is that of the nature and status of relations, in an
abstract  sense,  between  braille  and  print.  That  is  to  say,  there 
are  clear  and  partly  simple  relations  that  hold  between  print  and 
the  braille  code--for,  after  all,  the  braille  code  was  devised  on 
the  basis  of  print,  or  writing;  but  frequently  the  learner  of 
braille  is  not  already  in  possession  of  a  knowledge  of  print  or 
writing.

Problem  Areas 

After  reflecting  on  such  questions,  we  have  arrived  at  the 
conclusion  that  two  large  and  essential  problems  confront  us  at 
this  time: 

1.  The  present  lack  of  a  thorough  and  relatively  abstract 
internal  analysis  of  the  braille  code  stands  in  the  way  of  our 
ability  to  confront  effectively  and  confidently  many  of  the  major 
decisions  involved  in  designing  adequate  learning  materials. 
Although  we  do  not  minimize  the  high  intellectual  fascination  that 
such  a  formal  analysis  holds,  we  have  been  impelled  to  proceed  by 
practical  concerns;  the  following  analysis  reflects  this,  and  the 
ordering  of  elements  is,  to  a  considerable  extent,  influenced  by 
our  sense  of  practicality  in  teaching  and  in  course  design. 

2.  In  scrutinizing  materials  and  practices  employed  to  date 
in  teaching  braille,  and  even  those  explanatory  materials  that 
authoritatively present the braille (English Braille American
Edition  1959),  we  are  impressed  by  the  prevailing  tendency  to 
analyze  the  code  and  explicate  problems  via  elements,  processes, 
assumptions, and customs based on visual experience. Even though
all  concerned  are  well  aware  of  the  situation  of  the  blind  student, 
and  particularly  of  a  young  person  who  has  never  had  the 
opportunity  to  be  exposed  to  print,  much  of  the  discussion  and 
explanation  is  carried  on  as  if  all  members  of  the  dialogue  had  a 
degree  of  experience  with  the  rudiments  of  printed  or  written 
English.  So,  for  example,  in  materials  of  discussion  directed  to 
a  blind  student,  the  braille  unit  that  conveys  the  dental  spirant 
sound  at  the  beginning  of  a  word  such  as  thin  is  usually  talked 
about  as  though  it  were  obvious  that  it  should  be  spelled  in 
conventional  English  with  the  letters  t  and  h.  Of  course,  no 
teacher  of  braille  thinks  of  the  word  thin  as  starting  out  with  the 
two sounds t and h; and by mentioning this letter combination to a
person  who  has  never  seen  it,  the  teacher  can  give  an  initial 
impression  only  of  irrelevancy  at  best. 

We  propose  to  deal  with  these  problem  areas  by  presenting,  as 
compactly  as  possible,  the  results  of  a  fresh  and  completely 
internal  analysis  of  the  braille  code. 

In  this  analysis  the  characteristics  of  visual  print  are  never 
used  as  internal  elements  of  the  system  to  be  analyzed.  Rather, 
the  features  of  visual  print  are  segregated  as  a  separate  system 
external  to  the  braille  code;  that  is,  as  a  set  of  elements  with 
which  the  braille  elements  may  be  contrasted,  and  for  which 
relations  may  later  be  defined  between  the  two.  Our  analysis  both 
characterizes  the  internal  braille  code  and  states  the  relations 
between  braille  and  print.  We  shall  return  shortly  to  a  more 
precise statement of how this mode of analysis is to be conducted.

Major Steps in Ordering

To  keep  clearly  in  view  our  ultimate  practical  aim  in  light  of 
the  problems  discerned  above,  we  propose  that  the  goals  for  the 
devising  of  teaching  materials  should  follow  these  major  steps: 

1.  A  relatively  complete  internal  analysis  of  the  braille 
code.  We  say  "relatively  complete"  in  order  to  isolate  major 
issues;  to  see  in  which  direction  one  must  move;  and,  finally,  so 
as  not  to  overload  the  analysis  and  presentation  with  endless 
detail  or  rarely  occurring,  nonessential  elements.  We  restrict  the 
analysis  to  the  generally  used  major  part  of  literary  braille,  and 
content  ourselves  with  giving  as  examples  only  representative  or 
crucially  interesting  instances  for  most  of  the  classes  discussed. 
A  fairly  complete  list  of  the  resulting  categorization  is  given  in 
Box 1.

2.  Design  of  the  most  direct  and  reasoned  route  toward 
mastery  of  the  elements  and  their  combinations,  and  of  a  compact 
and  maximally  simple  notation  to  carry  out  such  discussion.  This 
means  that  all  teaching  materials  were  ordered  for  the  presentation 
of  all  elements  for  which  a  principled  ordering  can  be  determined. 
This  is  not  simply  a  question  of  leaving  nothing  to  chance;  it  is 
a  matter  of  proceeding  logically  from  the  known  to  the  less  known. 

3.  Introduction  of  visual  elements,  including  print  and 
writing.  (As  mentioned  earlier,  these  elements  must  be  clearly 
discriminated  for  the  purposes  of  the  analysis;  we  are  not  here 
concerned  with  the  problem  of  the  correct  introduction  of  these 
visual elements as a goal of the learning and teaching process.)

4. A careful consideration to be given, finally, to the
correct  phasing  and  insertion  of  the  activities  specified  under 
numbers  2  and  3:  in  short,  the  effective  teaching  of  braille  and 
the  appropriate  teaching  of  print.  (Again,  this  task  falls  beyond 
our  present  purpose.) 

As  we  have  just  observed,  numbers  3  and  4  fall  outside  the 
scope  of  the  present  study.  Number  2  depends  crucially  on  number 
1, and the publication Patterns: The Primary Braille Reading
Program  represents  the  first  essay  toward  fulfilling  the  objectives 
of  number  2.  The  following  discussion,  therefore,  addresses  itself 
to  the  objectives  of  number  1  only. 

Braille  and  Print  Writing:  Their  Mutual  Relation 

If  no  English  speaker  used  any  mode  of  representation  for 
English  other  than  braille,  the  analysis  of  braille  would  not 
differ  essentially  from  graphic  analyses  of  one  or  another  writing 
system  now  or  formerly  in  use  in  the  world.  Of  course  details 
would  differ.  The  writing  and  printing  of  English  is  essentially 
an  alphabetic  enterprise.  The  mixed  representation  that  braille 
employs  consists  of  symbols  for  letters,  syllables,  and  whole 
words. Consider these examples: the letter b in the braille word
boy; the syllable (er) in the braille word exercise; and the whole
braille word (people). An entire world of English speakers using
only  braille  would  present  us  with  the  straightforward  task  of 
correlating  the  braille  system,  internally  formulated,  with  the 
grammatical  and  semantic  facts  of  the  English  language.  Thus  this 
job would be analogous to, and only more complex than, that of
correlating conventional English spelling with the way our words
are  put  together.  But  braille  users  do  not  constitute  the  entire 
English  speaking  population;  nor  is  it  expected  that  those  who  use 
braille  will  fail  to  master  conventional  print  English  also. 

This  fact  imposes  a  separate  dimension  on  the  analysis,  with 
a  corresponding  task  for  analyst  and  teacher;  a  dimension  arising 
from  two  principal  facts  of  language  use  that  have  impinged  on  the 
character  and  function  of  braille. 

1.  Historically,  of  course,  braille  is  not  independent  of 
print  English.  It  has  been  devised  over  time,  with  print  English 
preceding  chronologically  and  lurking  in  the  background  as  a 
partial  model. 

2.  Braille  will,  moreover,  be  learned  by  students  who  wish  to 
convert  with  maximum  efficiency  that  knowledge  and  skill  into  a 
competence  with  English  print,  so  that  they  can  type  and  of  course 
use  a  computer. 

If  our  analysis  of  braille  is  to  take  these  two  aspects  into 
consideration,  the  resulting  formulation  of  the  relations  of  the 
internal  braille  system  with  the  structure  of  print  English  will  be 
complex  in  a  special  way;  that  is,  it  will  distinguish  more  classes 
of  elements  and  functions  than  if  only  the  braille  system  or  only 
the  structure  of  print  were  being  considered. 

At  every  stage  of  this  analytical  process,  we  must  ask 
ourselves:

1.  What  forms  are  distinguished  by  what  elements  and  what 
combinations in terms of braille shapes and braille units.

2. How each of these forms is correlated with, that is,
equivalent  to,  not  equivalent  to,  or  partially  equivalent  to, 
elements  and  features  and  combinations  of  print  English. 

We  see  immediately  from  this  that  there  are  two  important 
aspects  to  every  determination  that  we  make  for  a  braille  element: 
the  configuration  of  the  element  itself  (so  and  so  many  cells 
consisting  of  such  and  such  arrangements  of  dots),  and  the  element 
or  elements  of  print  English  with  which  this  braille  shape(s)  is 
found  to  be  correlated. 

In  the  theory  of  signs  as  elaborated  in  the  technical 
literature  of  linguistics,  such  a  relation  is  known  as  a  "sign 
relation,"  and  the  combination  of,  or  more  technically,  the  linkage 
between  the  signifier  (the  braille  configuration  in  this  case)  and 
the  signified  (the  print  English)  is  known  as  a  sign.*  To  invoke 
an  analogue,  this  value,  or  function,  stands  in  relation  to  the 
signifier  much  as  a  meaning  does  in  relation  to  its  signifier  which 
we  call  a  word.  Thus,  what  we  are  calling  value  here  is  much  akin 
to the notion of meaning; but, for the present, we shall not use
the  latter  term  in  this  sense,  lest  we  cause  confusion  with  a 
different  type  of  sign  relation;  namely,  that  of  linguistic 
semantics,  with  which  we  are  not  here  concerned. 
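
The sign relation just described, a linkage between a braille configuration (signifier) and the print English correlated with it (value), can be pictured as a simple pairing. The Python sketch below is illustrative only: the class and function names are invented for the example, and the dot numbers are taken from conventional literary braille usage rather than from this text. It also shows that a single configuration may enter more than one sign relation.

```python
from dataclasses import dataclass


def cell(*dots):
    """One cell's configuration as a set of dot numbers (assumed 1-6)."""
    return frozenset(dots)


@dataclass(frozen=True)
class Sign:
    signifier: tuple  # one configuration per cell, in reading order
    value: str        # the correlated print English


# Illustrative inventory; dot assignments are assumed from conventional usage.
SIGNS = [
    Sign((cell(1, 2, 4, 5),), "g"),
    Sign((cell(1, 2, 3, 4, 6),), "and"),
    Sign((cell(1, 2, 3, 4),), "p"),
    Sign((cell(1, 2, 3, 4),), "people"),
]


def values_for(signifier):
    """All print values linked to one braille configuration."""
    return [s.value for s in SIGNS if s.signifier == signifier]
```

Here values_for((cell(1, 2, 3, 4),)) returns both "p" and "people", while a configuration absent from the inventory returns an empty list: taken by itself it has no value and so is not yet a sign.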

This  consistent  correlation  of  print  values  with  the 
distinguished  braille  elements  may  be  called  for  convenience  a 
value  system,  or  sign  system,  in  the  sense  just  specified.  It  is 
in  such  an  analysis,  as  well  as  in  the  details  of  its  execution, 
that our approach differs fundamentally and to the greatest extent
from earlier formulations and presentations of the braille code.

The  following  internal  analysis  presents  a  relatively  detailed 
description  of  this  consistent  correlation  of  print  values  with  the 
particular  braille  elements  corresponding  to  them.  The 
presentation  consists  of  a  set  of  new  terms  and  the  regrouping  of 
all  elements  of  English  braille,  Grade  2,  as  well  as  the  new 
descriptions  of  these  elements  previously  discussed. 

The  Internal  Analysis:  A  New  Approach  to  the  Braille  Code 

To  begin  our  internal  analysis,  it  is  necessary  first  to 
distinguish  the  following  primes  of  braille.  For  clarity,  we  have 
restricted  the  meaning  of  each  prime  to  a  single  sense. 

Cell.  This  term  has  been  used  in  more  than  one  sense  in  the 
literature.  Here  cell  is  defined  as  an  abstract  space  twice  as 
high  as  it  is  wide,  in  which  there  are  six  positions  arranged  in 
three  rows  and  two  columns,  in  which  dots  may  appear. 

Shape. We define as a shape a single configuration made up
of  one  to  six  dots  and  occupying  a  single  cell.  Note  that,  so  long 
as  a  shape  is  defined  in  this  fashion,  it  does  not  yet  have  any 
necessary  value  or  meaning,  and  therefore  is  not  a  sign  in  the 
sense  used  in  the  sign  theory  alluded  to  above. 

Dot.  A  dot  is  defined  as  the  element  of  which  shapes  in  a 
cell  are  composed.  The  dots  of  a  braille  shape  occur  physically  as 
bumps  or  bosses. 
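
The three primes just defined (cell, shape, dot) can be sketched as a small data model. The Python below is only an illustration under stated assumptions: the numbering of the six positions (1-3 down the left column, 4-6 down the right) is the conventional braille numbering and is not given in the text, the names all_shapes and to_unicode are hypothetical, and the Unicode braille characters are used purely for display. Note that a shape in this sketch is nothing more than a set of dots; it carries no value and is therefore not yet a sign.

```python
from itertools import combinations

# Assumed conventional numbering of the six positions in a cell
# (three rows, two columns); this numbering is not stated in the text:
#   1 4
#   2 5
#   3 6
POSITIONS = (1, 2, 3, 4, 5, 6)


def all_shapes():
    """Return every shape: a configuration of one to six dots in one cell."""
    shapes = []
    for n in range(1, len(POSITIONS) + 1):
        for dots in combinations(POSITIONS, n):
            shapes.append(frozenset(dots))
    return shapes


def to_unicode(shape):
    """Render a shape as a Unicode braille pattern (display only)."""
    # In the Unicode braille block, dot k corresponds to bit k-1.
    mask = sum(1 << (dot - 1) for dot in shape)
    return chr(0x2800 + mask)


if __name__ == "__main__":
    print(len(all_shapes()))       # count of distinct non-empty shapes
    print(to_unicode({2, 4}))      # the shape that matches print i
```

Counting the shapes this way gives 63 configurations, which is the whole inventory from which braille units must be built.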

Braille  unit.  We  now  require  a  term  for  any  shape  or  shapes 
taken  together  in  correlation  with  its  (or  their)  value  or  meaning. 
It would be convenient and nonoffensive to ordinary English if the
word character could be used for this concept, especially since
that  is  the  established  English  term  for  the  element  of  Chinese 
writing  that  closely  resembles  in  function  this  braille  element. 
But  unfortunately  this  term  has  already  been  employed  for  some  time 
by  users  of  braille  in  a  deceptively  similar  but  ambiguous  sense; 
and  so,  unless  users  of  braille  wish  to  alter  their  terminology,  it 
will  be  necessary  to  find  some  other  term  for  this  pivotal  concept 
in  the  present  analysis.  For  this  purpose  the  term  braille  unit 
has  been  proposed  by  us.5 

A  single  braille  unit  may  consist  of  one  or  more  shapes;  as, 
for example, (go) = 1 shape, (tion) = 2 shapes, etc. Braille units
fall  into  three  major  types — letters,  modulations,  and  grams. 

1.  Letters.  Letters  are  either  alphabetic, 
such as b, or nonalphabetic, such as 2.

a. Alphabetic letters (or letters proper) have print-alphabetic
values; that is to say, these braille units match
occurrences of ordinary letters in print. For example, the shape
⠊ (dots 2 and 4) has the value of i.

b.  Nonalphabetic  letters  comprise  (1)  numbers,  the 
decimal  point,  and  the  fraction  bar,  and  (2)  certain  other  braille 
units  with  abstract  letterlike  segmental  function,  such  as  the 
asterisk  and  the  apostrophe.  The  numbers,  of  course,  take  the 
number  sign,  and  then  the  shapes  with  the  resulting  number  value 
may  be  thought  of  as  letters  of  a  numerical  alphabet  consisting  of 
twelve shapes (counting the decimal and the fraction bar) and
spelling number words. The asterisk may be viewed as a rather odd
unpronounceable  letter,  and  the  apostrophe  often  has  the  implied 
meaning  "letter  left  out."  The  reason  for  classing  these  two  as  if 
they  were  letters  or  numbers  is  that,  like  conventional  letters, 
they  have  sequential  segmental  properties  in  linear  order,  and  the 
characters  they  match  in  the  print  text  consist  regularly  of  single 
segmental  print  shapes. 

2.  Modulations.  Modulations  are  of  two  rather  different 
sorts:  punctuation,  as  for  example,  the  question  mark;  and 
register,  as  for  example,  italics.  What  these  have  in  common  is 
that  they  "do  things"  to,  that  is,  have  effects  on,  other  elements 
--the  segmental  elements--in  the  chain. 

a. Punctuation, in fact, has print values that are,
themselves, sequential in position; but these braille units differ
from others in having domains of effect extending at times to
considerable distances to the right and left of their sequential
position. Some punctuation looks back; examples are the period and
the non-Spanish exclamation point. Other punctuation encloses;
examples are parentheses and quotation marks. Still other
punctuation links; examples are the hyphen and the dash. Those that
look back have as a domain of their force what has gone before;
those that enclose both warn us of their application and close
their domain; those that link affect things on both sides.

b. Register is the term applied to those braille units
that include what have traditionally been called composition signs;
these units look forward, and may also automatically specify where
the scope or domain terminates. Examples are capital, italic,
letter, and number signs. These elements always have the effect of
modifying the basic segmental values of what follows; thus, they
change the dress of some elements, such as lower case into
capitals, or change what we think of as type style, such as italic,
or change letters into numbers, or change the abstract reading of
an element, such as the letter sign. Registers have the unique
property of finding no separate segmental counterpart in print.
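
The forward-looking behavior of a register can be illustrated with the number sign. The sketch below is a deliberate simplification and not a statement of the code's rules: it uses the ASCII character "#" as a stand-in for the number sign, relies on the standard braille convention (not spelled out in the text) that the shapes of a through j read as 1 through 9 and 0 after the number sign, and assumes that the register's domain ends at the first cell that is not one of those ten shapes. The decimal point and fraction bar are ignored, and the function name is hypothetical.

```python
# Standard braille convention, assumed here: after the number sign the
# shapes of the letters a-j are read as the digits 1-9 and 0.
DIGITS = dict(zip("abcdefghij", "1234567890"))
NUMBER_SIGN = "#"  # ASCII stand-in for the braille number sign


def apply_number_register(cells: str) -> str:
    """Read a string of cells, letting the number sign modify what follows.

    The register has no print counterpart of its own, so it emits nothing;
    it only changes the value of the cells inside its domain.
    """
    out = []
    numeric = False
    for ch in cells:
        if ch == NUMBER_SIGN:
            numeric = True          # register opens its domain
            continue
        if numeric and ch in DIGITS:
            out.append(DIGITS[ch])  # same shape, number value
        else:
            numeric = False         # domain ends (simplifying assumption)
            out.append(ch)          # same shape, letter value
    return "".join(out)


if __name__ == "__main__":
    print(apply_number_register("#abj"))
    print(apply_number_register("#ab cab"))
```

The same shapes appear on both sides of the space in the second call; only the presence of the register decides whether they are read as numbers or as letters.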

3. Grams. These are of three kinds: phonograms, such as the
(ance) in dance; morphograms, such as the (ance) in reliance; and
logograms, such as the words (rather), (the), (friend), (mother),
and (immediate). The distinction between grams and the two
preceding  units  is  that,  unlike  modulations,  they  are  segmental  in 
value;  but,  unlike  letters,  they  have  no  single  counterparts  in  a 
type  font.  Because  of  this  last  property,  these  are  the  braille 
units  that  later  will  give  rise  to  bidirectional  problems  in 
writing  and  spelling. 

a. A phonogram is a braille unit having a phonetic value
that would be written in print by more than one alphabetic symbol.
Phonograms are either one-shaped, like (th), (ch), (gh), the (ing)
in sing, the (ea) in read, the (ed) in bed, and the (ar) in bar, or
multishaped like the (ation) in nation, the (ound) in sound, the
(ong) in long, the (ence) in fence, the (ity) in pity, the (ness)
in Tennessee, and the (less) in bless.

b. A morphogram is a braille unit having the value of an
element in a word, such as an inflectional ending, prefix, or
suffix. Examples are the s in words, the (ing) in looking, the
(ed) in looked, the (ance) in avoidance, the (ation) in admiration,
and the (in) in inconsistent. Note that the shape(s) that make up
(ing), (ed), (ance), etc. appear as phonograms or morphograms,
depending on their function in words, that is, their "value."

c. A logogram is a braille unit made up of one or more
shapes having the value of an English word (conventionally, a chain
of letters between spaces) with either limited reflection or no
reflection at all of phonetic values. There are two principal
configurations of logograms, single-shape and multishape. Single
shapes comprise letter words and wordlets. A letter word is a
logogram that has a shape the same as that of a letter. Examples
are (but), (can), (do), (rather). Wordlets comprise all other
logograms carrying a word value. Examples are (and), (the),
(shall), (still), (there), (ought), (young), (those), (enough),
(cannot), (paid), (declare), (was), (to). It should be observed
that logograms do not lose their status as such when they undergo
derivation by affixes. Thus (spirit) remains a logogram, in this
case a compound-letter word(let), when it occurs as part of the
derived adjective spiritual. Such a definitional provision avoids
the need for pedantically encumbering the analysis by renaming
hosts of wordlets "morphograms" because they may form parts of long
derived words.6
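
The classification just described can be summarized as a small data structure. The following Python is a minimal sketch, not part of the analysis itself: the class and function names are hypothetical, and the lookup table contains only examples quoted in the text above. Its purpose is to show that the same shape(s) can belong to different kinds of braille unit depending on the value they carry in a particular word.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    ALPHABETIC_LETTER = "alphabetic letter"
    NONALPHABETIC_LETTER = "nonalphabetic letter"
    PUNCTUATION = "punctuation"
    REGISTER = "register"
    PHONOGRAM = "phonogram"
    MORPHOGRAM = "morphogram"
    LOGOGRAM = "logogram"


# The three major types of braille unit and the kinds each contains.
MAJOR_TYPE = {
    Kind.ALPHABETIC_LETTER: "letter",
    Kind.NONALPHABETIC_LETTER: "letter",
    Kind.PUNCTUATION: "modulation",
    Kind.REGISTER: "modulation",
    Kind.PHONOGRAM: "gram",
    Kind.MORPHOGRAM: "gram",
    Kind.LOGOGRAM: "gram",
}


@dataclass(frozen=True)
class BrailleUnit:
    value: str   # print value of the unit, e.g. "ance"
    word: str    # the print word in which it occurs
    kind: Kind


# Examples taken from the text: (print value, host word) -> kind.
EXAMPLES = {
    ("ance", "dance"): Kind.PHONOGRAM,
    ("ance", "reliance"): Kind.MORPHOGRAM,
    ("ing", "sing"): Kind.PHONOGRAM,
    ("ing", "looking"): Kind.MORPHOGRAM,
    ("ed", "bed"): Kind.PHONOGRAM,
    ("ed", "looked"): Kind.MORPHOGRAM,
    ("the", "the"): Kind.LOGOGRAM,
    ("spirit", "spiritual"): Kind.LOGOGRAM,  # status kept under derivation
}


def classify(value: str, word: str) -> BrailleUnit:
    """Look up the kind of a unit from the illustrative table above."""
    return BrailleUnit(value, word, EXAMPLES[(value, word)])


if __name__ == "__main__":
    for value, word in EXAMPLES:
        unit = classify(value, word)
        print(f"({value}) in {word}: {unit.kind.value}, a {MAJOR_TYPE[unit.kind]}")
```

A lookup keyed on the pair (value, word) rather than on the shape alone reflects the point of the analysis: the shape is only the signifier, and the unit is identified by the sign relation.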
Summary 

As  we  have  stated,  the  internal  analysis  as  described  in  this 
paper  was  used  for  regrouping  the  various  elements  of  the  braille 
code and devising new descriptions of these elements. This was
done to eliminate the many conflicting and confusing terms and
categories previously used by teachers and students, and to provide
them with a new system consisting of a relatively small number of
categories, or groups, and clear, precise, linguistically based
descriptions of all the elements of English braille, Grade 2.

In  addition,  we  have  presented  a  discussion  of  the  braille 
code,  based  on  linguistic  principles,  which  is  intended  to  clarify 
the  internal  characteristics  of  the  code.  Specifically,  it  is 
intended  to  dispel  the  notion  that  the  braille  code  and  the 
principles  of  braille  reading  are  in  a  way  analogous  to  the  print 
code  and  the  principles  of  print  reading. 

We  believe  that  the  results  of  this  analysis  can  be  used  to: 

1.  Provide  teachers  with  an  outline  of  braille  terms  that 
will  enable  them  to  describe  and  discuss  any  element  of  the  braille 
code  in  a  manner  easily  understood  by  children. 

2. Emphasize that the teaching of braille reading and print
reading  are  not  analogous,  and  promote  an  understanding  of  the 
internal  characteristics  of  the  braille  code,  thus  providing 
teachers  with  more  effective  strategies  for  teaching  reading  to 
children who use braille as their primary medium.

Notes

1. In fact, the analysis here presented resulted directly from the
efforts on the part of the first-named author to learn the braille
code from scratch by using the handbook (EBAE, 1959) and from his
need to clarify and restate the formulations and expositions set
forth in that influential work.

2. We claim therefore that earlier work was not merely inefficient
and inadequate in its obligation to the blind learner, but that it
was non-descriptive of braille.

3. Of course, because both braille and the inkprint some of you are
reading are equally representations of the same language and
because the designers of braille patterned it on inkprint, it is not
accidental that the two have much in common. We therefore do not
ignore inkprint; quite the reverse. The issue is rather where we
must seek our analytic criteria, and to what relations and status
we allot all elements.

4. The authoritative source on this concept is [de] Saussure (1916).

5. Brunit was considered but did not find favor.

6. A similar example to spiritual would be worldly. Some readers may
wonder about a specimen such as (some). Would the sequence (some)
in lonesome be similarly a wordlet? The answer is no. The
analysis will have already isolated and identified the suffix
(some) in lonesome, handsome, etc. (even though this ancient
and poorly productive English suffix is not very frequent or
salient today); thus the braille unit (some) here is a morphogram.
It is not the same as the logogram (some). The fact that these two
have radically different meanings is, however, a linguistic fact,
but not directly a feature of the status of these braille units in
the code.
Table 1

Outline of Braille Terms: Examples

1. Letters

   a. Alphabetic letters (26)

   b. Nonalphabetic letters (12)

      (1) 0-9
          decimal point
          fraction bar

      (2) other braille units with abstract letterlike function
          accent sign
          apostrophe
          asterisk
          ellipsis
          hyphen or dash (when used to indicate missing letters)

2. Grams

   a. Phonograms

      (ally)    Sally            (ed)     red
      (ance)    dance            (en)     pen
      (and)     sand             (ence)   fence
      (ar)      car              (er)     certain
      (ation)   nation           (ever)   several
      (bb)      rubber           (ff)     duffle
      (ble)     table            (for)    forest
      (cc)      occur            (ful)    awful
      (ch)      chair            (gg)     suggest
      (com)     come             (gh)     ghost
      (con)     contrary         (here)   adhere
      (dd)      paddle           (in)     pin
      (dis)     dispel           (ing)    sing
      (ea)      read             (ity)    city

   b. Morphograms

      (after)   afterlife        (ful)      wonderful
      (ally)    mathematically   (here)     cohere
      (ance)    avoidance        (in)       indecent
      (and)     multiplicand     (ing)      singing
      (ar)      secular          (ity)      rarity
      (ation)   admiration       (less)     useless
      (be)      befriend         (ment)     ornament, monument
      (com)     commiserate      (ness)     openness, oneness
      (con)     confuse          (sion)     aversion, confusion
      (dis)     disengage        (some)     loathsome
      (ed)      rubbed           (through)  throughout, throughway
      (en)      encephalogram    (tion)     reaction, prediction
      (ence)    providence       (th)       seventh
      (er)      zipper           (there)    therefore

   c. Logograms

      (1) Letter word

          (but)     (knowledge)   (that)
          (can)     (like)        (us)
          (do)      (more)        (very)
          (every)   (not)         (will)
          (from)    (people)      (it)
          (go)      (quite)       (you)
          (have)    (rather)      (as)
          (just)    (so)

      (2) Wordlet

          (about)       (always)    (cannot)      (ever)       (know)
          (above)       (and)       (character)   (father)     (herself)
          (according)   (be)        (child)       (first)      (himself)
          (across)      (because)   (children)    (for)        (itself)
          (after)       (before)    (conceive)    (friend)     (thyself)
          (afternoon)   (behind)    (conceiving)  (good)       (myself)
          (afterward)   (below)     (could)       (great)      (yourself)
          (again)       (beneath)   (day)         (had)        (oneself)
          (against)     (beside)    (deceive)     (here)       (ourselves)
          (almost)      (between)   (deceiving)   (him)        (themselves)
          (already)     (beyond)    (declare)     (his)        (yourselves)
          (also)        (blind)     (declaring)   (immediate)
          (although)    (braille)   (either)      (in)
          (altogether)  (by)        (enough)      (its)

3. Modulations

   a. Punctuation

      (1) look back
          colon                period
          comma                question mark
          exclamation point    semicolon

      (2) enclose
          bracket or brace (in pairs)
          comma (in pairs)
          parenthesis (in pairs)
          quotation marks, single (in pairs)
          quotation marks, double (in pairs)

      (3) link
          bar                  long dash
          hyphen               bracket or brace (one)
          dash

   b. Register

      letter sign
      number sign
      termination sign
      capital sign, single
      capital sign, double
      italic sign, single
      italic sign, double
Abstract

A plan is broached to continue and refine the analysis outlined in
Hamp and Caton, 1984. This plan has been pursued since 1989. The
conceptual foundations of the analysis are placed in the context of
20th century linguistic theory in brief and summary form. To
pursue such a plan the first requisite is a suitable text corpus to
serve as an empirical data base. The creation of such a corpus is
described. As a sample result we offer classed totals of braille
units contained in our text sample.

Background

As  one  of  the  components  employed  in  the  development  of 
Patterns: The Primary Braille Reading Program, an analysis of
American  Literary  Braille,  Grade  2,  was  performed  drawing  on  the 
model  of  linguistic  analysis.  Results  of  that  analysis  were  used 
in  designing  materials  and  recommendations  to  serve  the  teacher  of 
young  children  who  will  use  braille  as  their  learning  medium.  That 
analysis  provided  teachers  with  a  means  of  identifying  the  objects 
of  study,  of  simplifying  explanations  of  some  of  the  unique 
characteristics  of  the  braille  code,  and  also  provided  a  basis  for 
ordering  the  presentation  of  braille  units  to  young  children.  A 
summary  and  non-technical  report  of  this  analysis  was  published  in 
the  Journal  of  Visual  Impairment  &  Blindness  (Hamp  and  Caton,  May 
1984),  and  is  here  reissued  in  the  preceding  pages.  That  report 
itself  is  brief  and  was  intended  only  to  indicate  the  general 
directions  taken.  The  analysis  proved  to  be  useful  for  the  stated 
purposes,  but  was  never  considered  by  the  authors  to  be  exhaustive 
or  complete.  For  this  reason,  the  purpose  of  the  research  program 
outlined  in  the  following  pages  will  be  to  refine  and  expand  the 
initial  analysis.  Before  some  directions  of  the  refinement  and 
expansion  are  summarized,  however,  it  is  necessary  to  present  as 
background  a  brief  historically  arranged  summary  of  the  main  lines 
of development of linguistic theories--specifically theories of
grammar--which directly affect this braille analysis. The
presentation  which  follows  is  greatly  simplified  and  compressed. 
It  is  intended  only  to  indicate  and  call  to  mind  in  summary  the 
rapid  movement  which  the  field  has  witnessed  since  World  War  I. 
For  relatively  accessible  introductory  accounts  and  for  references 
toward  further  reading  one  may  consult  Crane,  Yeager,  &  Whitman 
(1981),  Akmajian,  Demers,  (Farmer),  &  Harnish  (1979,  1990),  Finegan 
&  Besnier  (1989),  O'Grady  &  Dobrovolsky  (1989),  and  always  with 
profit  Lyons  (1968).  As  a  mode  of  approach,  the  following  question 
is  posed,  and  aspects  of  the  answers  are  given: 

Where  did  linguistic  grammatical  theorizing  and  analysis  of 
the  braille  code  stand  at  the  beginning  of  the  1990s? 

A.  We  will  set  aside  the  historical  and  genetic  comparative 
linguistic  theorizing  of  the  19th  century  (and  later  up 
to  the  present  day)  as  not  being  relevant  to  braille 
problems  at  all.  On  this  subject  see  Pedersen  (1924, 
1931),  Morpurgo  Davies  (1991).  Between  the  1920s  and 
the  1950s  a  group  of  theories  of  grammar  was  proposed 
which  is  often  called  "structural."  An  important  aspect 
of  that  point  of  view  was  the  insistence  on  isolating, 
identifying,  and  characterizing  fundamental  entities 
separately  and  specifically  relevant  to  the  structure  of 
each  different  language.  Different  subvarieties  of  such 
theorizing  arrived  at,  or  emphasized,  differing  criteria, 
and  advocated  different  solutions  of  detail;  but  the  need 
to recognize relevant and principled basic entities was
generally agreed. The entities recognized by traditional
grammar--words, roots, affixes, sentences, etc.--had been
customarily  defined  or  specified  by  inconsistent, 
overlapping,  or  non-empirical  criteria.  The  entities  now 
recognized  usually  comprised  superficial  components  that 
occur  as  sequences  one  after  the  other  to  make  up  audible 
spoken  or  readable  written  sentences  of  language;  they 
were  preferably  not  abstract  objects  or  relations  or 
categories  that  fail  to  surface  as  observable  segments  or 
spans  in  such  sequences.  An  important  shortcoming  of 
this  brand  of  theorizing  was  the  visible  inability  to 
incorporate  such  analytic  formulations  in  a  satisfactory 
view  of  the  function  and  variation  of  such  structures  in 
the  use  of  human  language.  Theorizing  since  the  1950s 
has  concentrated  on  repairing  that  shortcoming  in  our 
perception  of  the  goals  of  grammar. 

B.  A  major  revision  in  grammatical  theory  came  towards  the 
end  of  the  1950s  with  what  has  since  been  generally 
called  "transformational  grammar."  One  important 
consequence  of  this  theoretical  view  is  the  realization 
that  a  correct  recognition  of  the  full  functional  (or 
"semantic"  or  perhaps  "semiotic")  properties  of  a  grammar 
will  not  necessarily  preserve  at  every  stage  of  analysis 
all  isolable  grammatical  entities  in  a  constant  state  of 
relevance  to  the  analyst,  nor  in  the  same  apparent 
categorization. In other words, just because you can
identify an element analytically, this does not mean that
every such element is equally pertinent, or pertinent to
the same degree, to all aspects or phases of the grammar.
So also the following six equivalent sentences:

John  made  the  cabinet. 

John  was  the  maker  of  the  cabinet. 

John  was  the  cabinet's  maker. 

It  was  John  who  made  the  cabinet, 
and  perhaps 

The  cabinet  was  made  by  John. 

The  making  of  the  cabinet  was  John's, 
recast  the  elements  into  quite  different  syntactic  roles, 
while  scarcely  altering  the  semantic  values.  What 
changes  here  is  mainly  the  stylistic,  discourse, 
pragmatic,  register,  or  elocutionary  value.  But  this 
realization  and  a  successful  formulation  of  the  rules  for 
constructing  such  equivalences  do  not  eliminate  the  need 
to  have  agreed  and  well  founded  ways  for  recognizing  and 
segregating  perceived  elements. 

C. Earlier work on transformational grammar (for which the
term  "generative"  has  often  been  used)  during  the  1960s 
introduced  many  changes,  both  of  detail  and  of  substance, 
in  the  theory,  and  even  led  to  a  rejection  of  the  view 
concerning  which  cognitive  phenomena  are  basic  to  a 
grammar (e.g., the doctrines called "case grammar" and
"generative semantics"). During the 1970s and 1980s a
sizeable number of competing or variant theories of
grammar has additionally been developed, presented and
debated. These include those known under the names or
doctrines of "extended standard theory," "trace theory,"
"pragmatics," "discourse analysis," "functional grammar,"
"relational grammar," "Montague grammar," "arc-pair
grammar," "space grammar," "government-binding,"
"generalized phrase structure," "natural phonology,"
"metrical phonology," "autosegmental phonology," "lexical
morphology," "template morphology," not that we attempt
to  exhaust  the  list. 

D.  It  will  be  seen  that  there  are  at  present  among  linguists 
many  different  points  of  view  as  to  precisely  how  an 
optimal  grammar  is  internally  constituted.  All  of  this 
is  really  quite  independent  of  the  fact  that  modern 
linguists  actually  agree  on  a  surprising  number  of  basic 
features,  aspects,  properties,  functions,  and  phenomena 
that  can  be  observed  and  stated  for  any  human  language 
thus  far  investigated  with  care.  All  of  this  question  of 
internal  formulation  is  independent  of  the  different 
emphases  and  vantage  points  from  which  language  may  be 
viewed  by  scholars  of  particular  domains;  such  scholars 
are  singled  out  or  categorized  as  sociolinguists, 
cognitive  scientists,  students  of  perception  and 
phonetics, dialectologists, comparative and historical
linguists, theorists of spoken versus written language,
on which latter see Senner (1989), not to mention those
specialists known as anthropologists, ordinary language
philosophers, logicians, specialists in artificial,
computer, and restricted languages, semanticists,
students of data retrieval and artificial intelligence--
all specialists in some obvious disciplines that exploit
or impinge on linguistic analysis and theories which may
be called grammatical.

E. The goal of grammarians in the 1940s, 1950s, and 1960s
was  to  write  a  complete  grammar  for  every  language.  With 
the  later  discussions  of  grammatical  theory  and  the 
resulting  realization  of  the  great  complexity  of 
language,  taken  in  its  most  inclusive  sense,  that  goal  as 
a  literal  possibility  has  been  largely  given  up  in  any 
short  term  understanding  of  a  reasonable  target. 
Instead,  today,  grammarians  mainly  write  or  frame 
paradigmatic  discussions  of  interesting,  crucial,  or 
boundary-defining  areas  of  the  arguments  with  the  hope  of 
arriving  at  an  incisive  definition  of  the  problem;  and  at 
a  pertinent  and  explanatory  formulation  of  a  regularity, 
or  rule  of  grammar,  or  of  the  coverage  and  function  of  a 
portion  of  the  assumed  grammar. 

F. It is therefore seen that the modern state of linguistics
does  not  claim  to  discover  definitive  grammars  in  a  final 
way.  Rather,  it  seeks  to  work  towards  the  refinement  of 
grammatical questions, at the same time clarifying
grammatical theory.

G.  It  is  therefore  at  present  the  case  that  formal 
grammatical  descriptions  are  presented  to  touch  upon  the 
total  range  of  a  language  only  in  the  case  of  languages 
and  dialects  which  have  never  really  been  investigated  at 
close  range  in  a  modern,  or  20th  century  linguistic  mode; 
and  the  model  for  presentation  in  such  cases  is  generally 
that  of  the  sort  of  observation  and  analysis  referred  to 
under A above, with excerpts from B or C. At some later
time,  or  even  by  anticipation  along  with  this  first 
analysis,  an  analytic  study  in  one  of  the  frameworks  of 
B,  C,  and  D  above  may  also  be  carried  out.  The  natural 
languages  which  are  now  analyzed  in  this  way  tend  to  be 
those  many  languages  of  the  world  spoken  by  smaller 
populations  and  often  called  "indigenous." 

In  such  a  framework  many  concerned  linguists  today 
conduct  salvage  work  on  endangered  languages  which 
tragically  find  themselves  on  the  brink  of  extinction. 
Such  studies  are  quite  separate  from  the  topic  of 
investigation which has been dubbed "language death" or
"obsolescence."

H. In preparation for the development of Patterns: The
Primary Braille Reading Program, an analysis of braille
was carried out by Hamp essentially within the
theoretical framework outlined in A above, or an analog
thereof. In this sense the presently attained analysis
of braille is at the theoretical level of linguistics in
the 1950s, just a third of a century out of date.

The  analysis  of  braille  was  not  carried  to  any  of  the 
further above modes at that time because:

1.  our  manpower  and  available  time  were  limited; 

2.  it  was  essential  to  keep  matters  simple  and 
manageable  and  not  to  become  engrossed  in  recent 
theoretical  debate,  so  as  not  to  lose  our  goal  from 
view; 

3.  it  seemed  possible  to  reach  a  useful  result  with 
that  mode's  investment  of  theory; 

4.  it  was  not  certain,  for  a  start,  that  a  profession 
anchored  in  tradition  would  rapidly  accept  the 
radical  changes  entailed  even  by  our  modest 
proposal  shorn  of  forbidding  complexities  drawn 
from  technical  linguistic  discourse; 

5. it was and is not clear that certain given modes of
more probing analysis will equally give immediate
yield for our grasp of braille as a limited object.

6. of course it will be ultimately important to
understand  the  properties  of  braille  in  their  full 
extent  and  context  as  a  symbolic  system,  and  we 
hope  to  participate  in  the  exploration  of  this  as 
it  unfolds  in  due  time. 

From  the  reception  accorded  our  first  set  of  efforts,  we 
are encouraged to believe that further development in
both the adaptation of theory and the accomplishment of
analysis  will  lead  to  formulations  and  descriptions  of 
portions  and  aspects  of  braille  writing  which  will  in 
some  measure  mirror  the  positive  developments  of  theory 
and  analysis  reflected  for  the  field  of  linguistics  in 
points  B,  C,  and  D  above.  For  the  time  being,  because  of 
the  priority  of  duties  in  our  profession  which  demand 
productive  results  all  along  the  way,  we  defer 
exploration  of  the  rather  more  theoretical  of  these  aims. 
Our  present  tasks  are  much  more  like  those  of  a 
composition  class  in  a  village  school  than  of  a  syllabus 
for  a  university  linguistics  course.  But  we  insist  that 
in  our  analytic  duties  we  must  keep  the  later  goals  and 
lesson  in  view. 

K.  In  the  meantime,  we  perceive  a  pressing  set  of  tasks 
which should be addressed. These tasks remain
theoretically within the framework of points A and B
above;  they  are  designed  to  build  upon,  consolidate, 
refine,  and  further  select  for  concentrated  application 
the  admittedly  rather  crude  results  that  were  obtained  in 
our  first  analysis  and  presented  for  nontechnical  reading 
in  the  outline  sketch  (Hamp  and  Caton,  1984)  which  is 
reproduced  above.  Those  results,  however,  have  proved 
fundamental  and  essential,  if  rudimentary,  to  our  further 
work in the interim.

Toward a Refinement

It  is  clear  that  for  the  empirical  study  of  English  braille  as 
used  in  North  America  an  acceptable  corpus  of  text  must  be  selected 
and  appropriately  prepared.  This  task  has  therefore  been  our  first 
priority  over  the  years  1989-1993.  Though  some  results  have  been 
rapidly obtained, we still have elements of detail to polish and
report.

Procedure 

The  text  materials  used  for  the  present  analysis  are  25 
samples  chosen  from  the  corpus  which  forms  the  basis  of  the 
publication,  Computational  analysis  of  present-day  American  English 
(Kučera, H., & Francis, W. N., 1967), generally known as the "Brown
Corpus."  This  corpus  comprises  1,014,312  words  of  prose  and 
consists  of  500  samples  of  about  2,000  words  each,  taken  from 
contemporary  publications  in  American  English.  The  samples  include 
scientific  and  learned  writing  as  well  as  fiction,  journalistic 
prose,  and  other  genres;  the  character  of  the  samples  was  carefully 
controlled  and  explicitly  specified.  A  meticulous  characterization 
and listing are to be found in Kučera and Francis 1967 and in their
manual  (revised  and  amplified  1979).  The  25  samples  used  in  the 
present  analysis  were  selected  so  that  the  text  materials  may  be 
representative  of  all  types  of  literature  included  in  the  "Brown 
Corpus."  The  proportionality  and  inventory  of  the  samples  drawn 
from  the  Brown  Corpus  are  as  follows,  with  genre  indicated  in  each 
instance.

                                     Number of texts
                                       in corpora
Genre                                 Brown   Ours          Our Samples

Informative:
A. Press: Reportage                     44    2             A1, A21
B. Press: Editorial                     27    1             B1
C. Press: Reviews                       17    1             C1
D. Religion                             17    1             D1
E. Skills and Hobbies                   36    2             E1, E21
F. Popular Lore                         48    2             F1, F21
G. Belles Lettres, Biography, etc.      75    4             G1, G21, G41, G61
H. Miscellaneous                        30    1 1/2-2*      H1, H21
J. Learned & Scientific                 80    4             J1, J21, J41, J6

Imaginative:
K. Fiction: General                     29    1 1/2-2       K1, K21
L. Fiction: Mystery and Detective       24    1             L1
M. Fiction: Science                      6    0
N. Fiction: Adventure and Western       29    1 1/2-1       N1
P. Fiction: Romance, Love               29    1 1/2-1       P1
R. Humor                                 9    1/2-1         R1

Total                                  500    25

*These are rounded at risk of bias.
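
The proportionality of the selection can be checked with a short calculation. The following is a minimal sketch, not the procedure actually used: it scales each Brown genre count by 25/500 and sets the exact quota beside the number of samples reported in the table. The names BROWN, OURS, and proportional_quotas are introduced here for illustration only.

```python
from fractions import Fraction

# Number of texts per genre in the full Brown Corpus (from the table).
BROWN = {"A": 44, "B": 27, "C": 17, "D": 17, "E": 36, "F": 48, "G": 75,
         "H": 30, "J": 80, "K": 29, "L": 24, "M": 6, "N": 29, "P": 29, "R": 9}

# Number of samples per genre in the 25-text selection (from the table,
# taking the rounded figure where a range such as "1 1/2-2" is given).
OURS = {"A": 2, "B": 1, "C": 1, "D": 1, "E": 2, "F": 2, "G": 4,
        "H": 2, "J": 4, "K": 2, "L": 1, "M": 0, "N": 1, "P": 1, "R": 1}

SAMPLE_SIZE = 25


def proportional_quotas(counts, sample_size):
    """Return the exact (unrounded) share of the sample owed to each genre."""
    total = sum(counts.values())
    return {genre: Fraction(n * sample_size, total) for genre, n in counts.items()}


quotas = proportional_quotas(BROWN, SAMPLE_SIZE)
for genre in BROWN:
    print(f"{genre}: exact quota {float(quotas[genre]):.2f}, chosen {OURS[genre]}")
```

The exact quotas necessarily sum to 25, and every chosen figure lies within one text of its quota; the rounding of the fractional quotas is where the risk of bias noted in the table footnote enters.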

The English texts which were selected to be used as a basis
for  the  analysis  were  copied  from  the  full  "Brown  Corpus,"  and  a 
new  file  was  created  containing  25  samples  selected  from  those  500 
texts  which  constitute  the  corpus. 

Data  of  the  full  text  had  been  scanned  to  identify  the  visual 
character  set  and  symbolism  used  in  the  version  of  the  text  which 
had  been  received. 

The  text  of  the  file  of  selected  samples  was  edited  for 
translation  into  braille.  Some  of  the  steps  in  editing  were: 
distinguishing  between  opening  and  closing  quotes  where  the  quote 
character (") appeared; distinguishing single quotes from
apostrophes,  both  indicated  by  (')  in  the  text;  marking  the 
headings  for  appropriate  braille  representation;  indicating  which 
letters  standing  alone  required  letter  signs;  editing  for  italics; 
and  identifying  foreign  words  and  acronyms  in  which  braille 
contractions  should  not  be  used. 

After  editing,  the  samples  were  translated  into  braille  and 
proofread  by  a  braillist. 

In  the  output  from  the  braille  translation  program,  each 
braille  cell  is  represented  by  a  four-digit  number.  Part  of  this 
number  indicates  the  dots  which  form  the  shape.  This  part  of  the 
number  provides  the  data  for  embossing.  The  other  part  of  the 
number  serves  to  distinguish  between  two  cells  (often  referred  to 
as  "signs")  with  different  meanings  regardless  of  their  possible 
identity  in  shape.  For  instance,  (gg)  and  (were)  have  code  numbers 
of which three digits are the same and one is different. This
unique identification is used for checking the accuracy of the
program. 

The  four-digit  cell  numbers  were  grouped  to  correspond  to 
braille units. This resulted in a data set in which (not), (ness),
and (neither), as examples, were represented by four-, eight-, and
twelve-digit  numbers  respectively. 
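
The grouping of cell numbers into unit numbers can be pictured with a small sketch. It is illustrative only: the actual code values are not given in this report, so every four-digit number below is invented, and the assumption that the first three digits carry the dot pattern while the last digit carries the meaning is ours, made solely to show how two signs of identical shape can still be told apart.

```python
CELL_DIGITS = 4  # each braille cell is represented by a four-digit number


def join_cells(cells):
    """Concatenate four-digit cell numbers into one braille-unit number.

    Numbers are kept as strings so that leading zeros are preserved.
    """
    for cell in cells:
        if len(cell) != CELL_DIGITS or not cell.isdigit():
            raise ValueError(f"not a four-digit cell number: {cell!r}")
    return "".join(cells)


def split_unit(unit_number):
    """Recover the individual cell numbers from a unit number."""
    if len(unit_number) % CELL_DIGITS:
        raise ValueError("unit number length must be a multiple of four")
    return [unit_number[i:i + CELL_DIGITS]
            for i in range(0, len(unit_number), CELL_DIGITS)]


def same_shape(cell_a, cell_b):
    """Assumed layout: first three digits = dot pattern, last digit = meaning."""
    return cell_a[:3] == cell_b[:3]


# Invented code values, for illustration only.
NOT = join_cells(["1340"])                      # one cell
NESS = join_cells(["0560", "2340"])             # two cells
NEITHER = join_cells(["1340", "0150", "0240"])  # three cells
GG, WERE = "2351", "2352"                       # same shape, different meaning

print(len(NOT), len(NESS), len(NEITHER))
print(same_shape(GG, WERE), GG != WERE)
```

Under these assumptions the unit numbers for (not), (ness), and (neither) come out as four-, eight-, and twelve-digit strings, and (gg) and (were) agree in three digits while remaining distinct.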

A  counting  program  was  developed  having  a  table  of  the  braille 
units.  Each  entry  contained  the  numeric  value  of  the  unit,  the 
category  of  the  unit,  an  inkprint  equivalent,  and  a  unit  name.  The 
names  were  limited  to  four  characters  using  an  abbreviated  form 
suggested  by  the  braille  in  some  cases,  as  Roman  "mch"  in 
representing  (much),  and  arbitrary  abbreviations  in  others,  as 
Roman  "whch"  for  (which).  Thus  it  should  be  clearly  understood 
that  these  representations  are  purely  arbitrary  tags  of 
convenience.

The  program  reads  the  file  of  braille  unit  numbers  and  prints 
rows  of  dots  representing  the  braille  cells,  a  line  beneath  the 
dots  showing  the  inkprint  equivalent,  and,  between  these,  a  line 
marking the beginning of each braille unit.
For each line of braille text, there follows a count of the
number  of  units  in  each  of  the  following  categories: 

ALPH  alphabetic letters
NONA  non-alphabetic letters (numbers, decimal point)
OTHR  non-alphabetic letters (asterisk, apostrophe)
GRAM  phonogram, morphogram, logogram
PUNC  punctuation (modulation)
RGST  register (modulation)

Note: These present categories closely approach the inventory
discerned in Hamp & Caton, 1984. As our work
progresses we intend to expand these categorial
discriminations by basing the ongoing analysis on ever
more refined observations.

Detailed  definitions  of  the  categories  can  be  found  in  the 
teacher's edition of each level of Patterns: The Primary Braille
Reading Program. The braille units in a line are listed
individually by name and number of occurrences. See Figure 1 for
an  illustrative  reproduction  of  the  above  combined  data.  The 
program  also  prints  a  count  of  braille  units  individually  and  by 
category  for  each  sample  and  for  the  complete  file;  these  data  will 
be  discussed  below. 
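
The unit table and the per-line counts described above might be organized along the following lines. This is a hedged sketch rather than the counting program itself: the numeric values are invented, the tag "cma" for the comma is hypothetical, and only the tags "mch" and "whch" and the six category labels come from the text.

```python
from collections import Counter
from dataclasses import dataclass

CATEGORIES = ("ALPH", "NONA", "OTHR", "GRAM", "PUNC", "RGST")


@dataclass(frozen=True)
class UnitEntry:
    number: str    # numeric value of the unit (invented values below)
    category: str  # one of CATEGORIES
    inkprint: str  # inkprint equivalent
    name: str      # arbitrary tag of at most four characters

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if len(self.name) > 4:
            raise ValueError(f"unit name longer than four characters: {self.name}")


# A tiny illustrative table keyed by unit number.
UNIT_TABLE = {entry.number: entry for entry in (
    UnitEntry("1001", "ALPH", "a", "a"),
    UnitEntry("2001", "GRAM", "much", "mch"),
    UnitEntry("2002", "GRAM", "which", "whch"),
    UnitEntry("3001", "PUNC", ",", "cma"),
)}


def count_units(unit_numbers):
    """Count the units of one line by category and by unit name."""
    by_category = Counter({category: 0 for category in CATEGORIES})
    by_name = Counter()
    for number in unit_numbers:
        entry = UNIT_TABLE[number]
        by_category[entry.category] += 1
        by_name[entry.name] += 1
    return by_category, by_name


line = ["2001", "1001", "3001", "2002", "2001"]
by_category, by_name = count_units(line)
print(dict(by_category))
print(dict(by_name))
```

Applying the same function to all the unit numbers of a sample, or of the complete file, would give totals of the kind reported for each sample and for the file as a whole.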

Now  that  we  have  our  sample  corpus  in  an  appropriate  state  for 
computer  processing  in  both  braille  and  print  versions,  and  already 
with some facilitating analytic categorization correlated in
accessible form, it is immediately apparent that a great wealth of
information  lies  amply  awaiting  imaginative  consultation  or  else 
can  be  readily  developed  with  the  investment  of  only  a  modest 
amount  of  further  data  insertion  and  analysis.  Indeed,  we  already 
have  at  hand  a  shopping  list  of  embarrassing  proportions.  Along 
these  lines,  we  will  welcome  inquiries  so  far  as  we  are  in  a 
position  to  satisfy  them. 

As  a  minuscule  sample  of  the  information  already  in  hand  we 
offer  here  (Figure  2)  a  tabulation  of  the  totals  of  braille  units 
occurring  in  the  entire  set  of  25  texts  drawn  from  the  Brown 
Corpus.

Even a casual glance at the totals for the 25 texts reveals a
dramatic  fact.  It  is  known  that  the  frequencies  for  whole  English 
words  in  the  Brown  Corpus  show  at  the  lower  end  a  long  trail  off 
with  many  single  occurrences.  Out  of  37851  dictionary  words  (i.e., 
forms  which  would  appear  as  headwords  in  a  dictionary)  which  make 
up the Brown Corpus, 2124 account for 80% of the Corpus text, while
as many as 22000, or 58% of the list, occur but once each. Note
that the pattern for least frequent braille units is entirely
different.

We propose to continue in the near future with studies of this
corpus sample that are more penetrating and less obvious, or
superficial, in character, and more promising for the solution of
problems that previously a sharp eye and pencil could scarcely touch.

Abstract

Lengths  of  braille  words  were  studied  as  part  of  the  linguistic 
analysis  of  Grade  2  Braille.  Base  data  consisted  of  Brown  Corpus 
selections,  which  have  been  described  in  the  preceding  study.  In 
recognition  of  the  known  complexity  yet  finite  nature  of  the  notion 
"word"  some  of  the  text  items  were  excluded  in  order  to  yield  an 
indisputable  body  of  "normal"  units  for  the  study.  Excluded,  or 
"Selected,"  text  segments  comprise:  1.  items  of  arbitrary  length 
and  structure  such  as  numerals  and  acronyms;  2.  aberrant  items, 
including  hyphenated  complexes  and  proper  names,  which  raise 
problems  of  criteria;  and  3.  debatable  items  such  as 
abbreviations,  print  contractions,  foreign  words,  titles,  and  day 
and  month  names.  There  remained  45408  running  text  words  from 
which  punctuation  and  composition  signs  were  removed  as  fortuitous 
or  irrelevant.  Each  word  was  assigned  a  length  consisting  of  three 
numbers  in  the  format  C,N,S,  giving  the  counts  of  print  characters, 
braille  units,  and  braille  shapes.  For  example,  the  length  of 
tension  is  7,3,4.  Various  tabulations  were  made  of  these  lengths. 
While  there  is  a  preponderance  of  short  words  in  both  print  and 
braille,  this  feature  is  more  marked  in  braille.  More  than  one- 
third  of  the  words  in  braille  text  consist  of  only  one  shape. 
While  43  percent  of  print  words  have  three  or  fewer  letters,  in 
braille  63  percent  have  three  or  fewer  units.  Certain  length 
combinations occur very frequently. More than 19 percent of all
words have the length 3,1,1. Five lengths (3,1,1; 2,1,1; 2,2,2;
4,1,1; and 4,4,4) account for 40 percent of the words. Words which
have no braille contractions, i.e., where the counts of C, N, and
S are equal, constitute ca. one-fourth of the text. In more than
one-half  of  the  word  lengths,  one  finds  one  or  two  fewer  shapes  in 
braille  than  characters  in  print.  Tables  of  interrelation  among 
the  counts  of  print  characters,  braille  units,  and  braille  shapes 
confirm  our  intuition  that  while  there  is  general  correspondence 
between  the  lengths  of  words  in  print  and  braille,  deviation  arises 
noticeably  out  of  the  existence  in  braille  of  units  (contractions) 
which  may  stand  for  anywhere  from  two  to  ten  letters.  On  the  other 
hand,  between  braille  unit  counts  and  braille  shape  counts  a  close 
parallel  prevails.  Word  length  information  is  important  for 
braille  education  and  for  an  understanding  of  the  characteristics 
of  the  braille  communication  medium  and  its  genres. 
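
The length format C,N,S can be made concrete with a brief sketch. It is an illustration under stated assumptions, not the tabulation program: a word is assumed to arrive already segmented into braille units, each unit recording its print spelling and the number of cells (shapes) it occupies. The names Unit and word_length are introduced here for the example.

```python
from typing import NamedTuple


class Unit(NamedTuple):
    print_text: str  # inkprint letters the unit stands for
    shapes: int      # number of braille cells the unit occupies


def word_length(units):
    """Return (C, N, S): print characters, braille units, braille shapes."""
    c = sum(len(unit.print_text) for unit in units)
    n = len(units)
    s = sum(unit.shapes for unit in units)
    return (c, n, s)


def is_uncontracted(units):
    """A word has no contractions when C, N, and S are all equal."""
    c, n, s = word_length(units)
    return c == n == s


# "tension" as t + en + sion, where (sion) occupies two cells.
TENSION = [Unit("t", 1), Unit("en", 1), Unit("sion", 2)]
# "the" as a single one-cell contraction.
THE = [Unit("the", 1)]
# "at" spelled letter by letter.
AT = [Unit("a", 1), Unit("t", 1)]

print(word_length(TENSION))  # (7, 3, 4)
print(word_length(THE))      # (3, 1, 1)
print(is_uncontracted(AT))   # True
```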

An Analysis of Braille Word Lengths

A  study  of  the  lengths  of  words  found  in  running  American 
English  braille  text  has  been  conducted  as  part  of  the  project  "A 
Linguistic  Analysis  of  Grade  2  Braille." 

Data  for  the  study  comprised  a  braille  version  of  selections 
from  the  Brown  Corpus.  This  material,  used  also  for  other  analyses 
in  the  project,  consists  of  25  sample  texts  of  equal  length 
representative  of  the  range  of  subjects  and  styles  in  the  Brown 
Corpus  literature;  cf.  no.  II  of  this  collection. 

The  notion  "word"  may  seem  at  first  blush  to  be  rather  simple 
and  obvious:  perhaps  what  occupies  the  line  between  two  spaces  on 
a  page.  What  then  do  we  do  with  sequences  such  as  pre-  and 
postmodernism or brother- and sister-in-law? If we isolate pre- we
not  only  end  up  with  a  "word"  that  offends  our  sense  of  notional 
integrity,  but  we  embarrass  our  analytic  criteria  by  extracting  a 
"word"  without  a  "base",  without  its  core.  With  brother-,  even 
worse,  we  allege  an  erroneous  kin  relation.  Let  us  then  reverse 
our  stance  and  register  the  full  part  which  was  elided;  but  this  is 
surely  false,  since  the  elision  was  deliberate  and  its  purpose  was 
to  avoid  unwanted  repetition.  Besides,  such  a  procedure  tampers 
with  the  text;  we  then  have  no  easy  or  objective  way  of  forbidding 
the  gratuitous  supply  of  all  sorts  of  addenda  which  a  reader  might 
feel to be explicatory, or even clarifying and more felicitous
for  the  style.  Our  job  is  to  count  what  is  there,  to  have  explicit 
rules for recognizing what is there and for keeping track of what
we do not count.

There  are  other  ways  in  which  the  search  for  a  satisfactory 
definition  of  the  notion  "word"  encounters  difficulty.  Are  both 
story and stories instances of the same word? Are sleep and slept,
or feed and fed, the same word; or go and went? If we agree that
story  and  storey  are  different  words,  what  about  dialog  and 
dialogue?  Is  (to  be)  heading  the  same  word  as  (to)  head,  or  head 
(home)  the  same  as  head  (the  class)?  Is  heading  (the  class)  an 
instance  of  (a  newspaper)  heading?  Is  it  useful  to  say  that 
headstrong  CONTAINS  the  word  head,  or  that  timeliness  CONTAINS  time 
and  timely?  That  is,  can  words  be  inside  words?  If  so,  what  about 
heady  or  hearty? 

There  are  even  more  subtle  problems  in  deciding  what  a  "word" 
is  and  what  we  should  regard  as  its  extent.  It  is  easy  to  see  that 
the  verb  forms  leans,  leaning,  and  leaned  have  a  part  in  common. 
It  is  a  bit  less  clear-cut,  if  we  take  pronunciation  into  account, 
to  claim  that  means,  meaning,  and  meant  (all  taken  as  forms  of  the 
verb)  share  just  the  same  corresponding  part.  We  see  also  that  we 
can  identify  a  common  part  in  story  and  stories  successfully  if  we 
adopt  a  simple  rule  that  says  that  we  must  write  i  as  y  when  it 
comes  before  a  space.  Let  us  consider  now  the  simple  set  four, 
fourteen,  and  forty.  If  we  can  solve  the  problem  mentioned  above 
suggesting  that  words  may  contain  words,  then  we  find  no 
predicament  in  saying  that  fourteen  actually  CONTAINS  four  (4+10 
with  ten  in  a  special  shape,  the  vowel  with  two  e's),  and  that 
forty has 4 and 10 in special shapes when the meaning of the
combination is 'multiply'. In other words, we will say that these
derivative  "words"  are  formed  by  simple  adjoining.  By  stretching 
our  criteria  we  can  even  account  in  a  similar  fashion  for  some  of 
our  verb  forms:  We  might  say  of  the  pair  leans  and  lean  that  the 
former  has  an  -s  for  the  third  person  (she,  he  or  it)  and  that  the 
latter  then  has  lean  as  the  skeleton  or  essence  of  'leaning'  or 
'tilting'  to  convey  the  activity  or  state  for  all  unspecified 
persons.  In  this  way  we  have  arrived  at  the  common  part  of  leans, 
leaning,  leaned,  and  lean — a  kind  of  abstract  entity  LEAN  which 
denotes  a  certain  verbal  function  stripped  of  its  inflections. 

Yet  things  do  not  always  work  out  so  conveniently.  If  we 
consider  another  simple  pair,  peach  and  peaches,  we  cannot  extract 
peach  from  peach-es  in  the  same  way  without  doing  violence  to  the 
meaning.  To  put  it  simply,  a  peach  is  singular,  but  peach-es  is 
not  singular  and  plural.  This  is  not  to  say  that  we  cannot  contrive 
to solve this riddle somehow; and if we remember a peach tree
(never a peaches tree; and notice with Peachtree Street in Atlanta
how we waver with compounds) we are led to the essence, or abstract
entity,  of  the  noun  PEACH.  But  we  are  still  left  with  the  task  of 
stating  just  how  peach  denotes  both  PEACH  and  singular.  An 
analogous  problem  presents  itself  in  the  case  of  meat,  a  mass  noun 
(some  meat,  not  a  meat),  and  meats  'kinds  of  meat'.  This  is,  of 
course,  the  problem  that  underlies  a  portion  of  the  construction  of 
a  dictionary  entry. 

We  see  then  that  a  "word"  is  a  complex  notion.  It  has  both  a 
shape (a sound, or a written or brailled configuration) and a
meaning or signified value. There are at least two kinds of
annoyance  that  arise  for  us  as  we  try  to  identify,  delimit, 
measure,  and  count  words  in  a  written  or  brailled  text.  There  is 
the  question  of  what  happens  to  be  found  standing  between  spaces 
and  how  deviant  these  spaces  may  be;  and  there  is  the  problem  of 
defining  identity  and  function  among  partials  within  such  non- 
deviant  spans. 

There  exists  a  very  large  serious  literature  touching  on  the 
notion  "word",  too  large  for  us  to  pretend  to  survey  or  even  list 
here.  Perhaps  as  a  seeming  irony,  there  is  no  entry  word  in  the 
10-volume  Encyclopedia  of  Language  and  Linguistics  (1994;  volume  9, 
page 4989); it seems that the very complexity of the notion led the
editors  to  an  immediate  partition  of  the  subject,  or  to  eliminate 
it  as  a  proper  rubric.  Thus  we  find  the  entry  word,  phonological 
(ibid.  1994:  5007-9),  where  it  is  stated  that  orthographic  spaces 
beg  the  question,  that  this  unit  is  the  smallest  piece  which  can  be 
pronounced  alone  with  a  stable  meaning,  and  that  "small  words" 
(e.g.  was  or  for),  with  their  varying  pronunciation,  create  a 
problem.  In  short,  all  the  discussion  here  dwells  on  the  spoken 
language  and  dismisses  the  graphic  as  problematic.  Otherwise,  we 
find  in  this  major  encyclopedia  only  word  recognition  and  lexical 
access (5009-14), word-formation: compounding (5021-6; cf. Katamba
1993:  297),  word-formation:  neo-classical  combinations  (5026-8; 
a  puzzling  category),  and  word-formation:  shortening  (5029-31);  a 
surprisingly  short  inventory.  Yet  in  the  computer-produced  subject 
index to this Encyclopedia 1/94, or ca. 1%, of the listing of all
the non-obvious questions regarding language is captioned word or
a  phrase  including  the  expression  word.  On  the  other  hand,  much  of 
interest  to  the  notion  "word"  is  to  be  found  in  the  article  clitics 
(vol.  2,  pp.  571-6).  As  a  complex  technical  example  of  an  attempt 
to  arrive  at  an  exact  delineation  of  a  phonological  word  using  the 
criteria  of  clitic  function,  consider  the  following  preliminary 
definitional  footnote  of  Spotts  1953: 

We  have  chosen  to  consider  the  intonational  unit 
described  by  Miss  Pike  as  a  phonological  word,  defined 
among  other  characteristics  (for  which  see  her  paper)  by 
intonemic  placement,  since  each  intoneme  (or  intoneme 
sequence)  ends  a  word.  Phonological  words  consist 
morphologically  of  an  optional  proclitic  followed  by  a 
simple  or  compound  stem  which  in  turn  is  sometimes 
followed  by  an  infix  and/or  an  enclitic.  A  simple  stem 
is  composed  of  a  root  followed  by  a  stem  formative.  A 
compound  stem  is  composed  of  two  simple  stems  in  sequence 
or  an  abbreviation  of  these. 

Instead  of  trying  to  survey  here  the  vast  literature  on  the 
notion  "word",  when,  after  all,  that  is  not  itself  the  primary  aim 
of  our  present  study,  let  us  sample  some  informed  and  serious 
general  statements  on  this  accepted,  clearly  useful,  intuitively 
grasped  notion  that  seems  to  apply  in  practically  all  human 
languages  that  have  been  attentively  investigated.  For  some 
concise and clear remarks in simple terms we may first note from
Crystal's (1987:91) entry words:

Words sit uneasily at the boundary between
morphology and syntax. In some languages - 'isolating'
languages, such as Vietnamese - they are plainly
low-level units, with little or no internal
structure. In others - 'polysynthetic' languages
such as Eskimo - word-like units are highly
complex forms, equivalent to whole sentences....

Because a literate society exposes its
members to these units from early childhood,
we all know where to put the spaces - apart
from a small number of problems, mainly to do
with hyphenation....

It is more difficult to decide what words
are in the stream of speech, especially in a
language that has never been written down.

And we may excerpt from Crystal's (1987:104) section on "semantic
structure":

People readily talk about the 'meaning of
words'. However, if we wish to enquire precisely
into semantic matters, this term will not do....

1. The term word is used in ways that obscure the
study of meaning. The forms walk, walks, walking,
and walked could all be called 'different words';
yet from a semantic point of view, they are all
variants of the same underlying unit, 'walk'. If
the variants are referred to as 'words', though,
what should the underlying unit be called?...

2. The term word is useless for the study of
idioms, which are also units of meaning....

3. The term word has in any case been appropriated
for use... in the field of grammar, where it does
sterling service at the junction between syntax and
morphology....

For  such  reasons,  most  linguists  prefer  to  talk 
about  the  basic  units  of  semantic  analysis  with 
fresh  terminology,  and  both  lexeme  and  lexical  item 
are  in  common  use....  It  is  lexemes  that  are 
usually  listed  as  headwords  in  a  dictionary. 

A  more  sophisticated  statement  concerning  the  perplexing  and 
complex  nature  of  the  criteria  for  discerning  these  units  is  to  be 
found in the article Words by E. M. Uhlenbeck in Bright (1992, vol.
4:  246-8),  and  references  therein.  We  excerpt  here  some  leading 
parts  of  that  excellent  article,  which  should  be  consulted  in  its 
full  text  and  followed  up  through  its  informed  references. 

For the sentence as well as for the word,
many definitions have been proposed; but so
far none has gained general acceptance....
This lack of consensus among linguists stands
in sharp contrast to the general agreement of
native speakers everywhere, who seem convinced
that they have words at their disposal for
daily use in actual speech.

Four basic issues are involved in the
problem of defining the word:

1. Is the word a universal? That is,
... is it part of every language?...

2. What kind of unit is a word? Are
words grammatical units...? Or are they
primarily SIGNS, i.e. units of form and
meaning...?

3. How are words related to sentences?
Is every sentence... analyzable into a
sequence of words...?

4. What is the position of the word in
language structure? Does it occupy, as was
traditionally assumed, a central position in
grammar...?

The answers given to all these questions
largely depend on the theoretical position
adopted. On the European continent, a
majority of linguists tended to accept
Saussure's conclusion that... (Saussure
1916:159) seems to agree with what laymen
generally feel to be true. The Prague School
as well as Dutch structuralists... built on
this view;...

In Great Britain, however, ...the word
was viewed as a grammatical unit.... Other
linguists seemed to have some reservations
about the universality of the concept:...

In American linguistics, the Neo-Bloomfieldians - influential
between 1940 and 1960 - generally considered not the word but the
morpheme as the smallest and basic grammatical
unit.... The question of universality was
rarely broached before 1970....

Chomsky 1970 marks the beginning of a new
period in which the lexicon and morphology,
fields previously neglected in generative
grammar, were recognized as important domains
of linguistic research... This led to a
renewed interest in the word, its role in
grammar, and its semantic nature. This has
resulted in psycholinguistic studies..., and
studies within a generative framework....

Since the word may be the locus of
morphonological, morphological, and syntactic
regularities, it seems unlikely that one
single criterion will suffice to distinguish
words from all other linguistic entities.


The great Danish linguist Hjelmslev (1970/i.e. 1963 from ca.
1943:32) provides us with a particularly pure Saussurean statement:

Every  language  appears  to  us  first  of  all 
as a system of signs, that is to say, a system
of expression units that have content, or
meaning, attached to them. Words are signs of
this sort. But parts of words can also be
signs: -s in English is a sign of the genitive
(Jack'-s father) and a sign of the third
person singular present (he write-s). [the
example in the Danish original is better and
clearer in the latter instance, since the -s
is the Danish passive] A word like
in-act-iv-ate-s [the Danish example more closely
resembles un-like-ly, with three parts] is a
sign consisting of five different smaller
signs. A sign may consist of one expression
element with one content element attached to
it, like the English sign -s in Jack's father,
which consists of the expression element s
with its attached content element
'genitive'; or it may be formed - both on the
expression side and on the content side - by the
combination of two or more elements, like the
Latin sign -arum in bon-arum mulierum 'of the
good women', which consists of four expression
elements - a, r, u, and m - and three content
elements - 'genitive', 'plural', and 'feminine'.

And he follows this on page 91 with the ensuing
characteristically exact and laconic observations:

The only linguistic typology to achieve a
place in classical linguistics was a
classification according to linguistic usage.
The central point of interest was the
structure of signs, especially of words.
Words are permutable signs, signs that can
exchange places within a linguistic chain:
softly answered consists of two words, because
one can also say answered softly; soft-ly and
answer-ed each consist of two signs, but these
signs cannot be put in another order.
Permutable signs attracted an extraordinary
amount of attention from classical
linguistics, beginning with antiquity, since
it was thought, in connexion with Aristotelian
conceptual logic, that each such sign stood
for one concept.


A more natural English example might be slowly nodded and
nodded slowly. We cannot replicate Hjelmslev's elegant Danish
example  of  'the  boy  runs',  where  the  Danish  article  attaches  to  the 
end  of  the  noun. 

The  seeming  conflict  in  criteria  invoked  for  the  formal 
recognition  of  so  widespread  a  notion  as  that  of  word  is  evident  in 
the  best  theoretical  linguistic  writing  of  all  periods  and  of  most 
schools  of  thought.  It  should  therefore  not  surprise  us  that  a 
recent  and  well  received  handbook  (Akmajian,  Demers,  Farmer,  and 
Harnish  1990:11-13)  tells  us  that  words  are  encoded  through 
features  of  phonology,  of  morphology,  of  syntax,  of  semantics,  and 
of  pragmatics — indeed  of  all  the  subfields  of  linguistics  discerned 
and  dealt  with  in  that  book. 

In  view  of  the  difficulties  encountered  in  arriving  at  a 
satisfactory definition of a "word," some text items were
eliminated  in  order  to  assemble  a  body  of  data  which  might  be 
generally  agreed  upon  as  consisting  of  "words." 

Moreover,  for  this  study  of  word  length  it  is  of  central 
importance  that  the  material  analyzed  be  composed  of  strongly 
finite  segments.  It  is  a  part  and  an  implication  of  the  structure 
of  language  that  its  elements,  such  as  words,  are  formed  by  rules 
that  may  indeed  be  iterative  but  that  they  are  formed,  or  stored 
(e.g.  in  the  dictionary),  in  lengths  that  are  not  unlimited;  these 
lengths  are  in  fact  governed  by  the  rules  of  grammar  that 
specifically  characterize  each  language.  It  is  therefore  of 
negligible  interest  to  our  enquiry  if  a  written  formation  can  be 
found that in principle may be of unlimited or inordinate length.

For this study, the corpus of text items, in relation to
spacing,  has  been  dichotomized  into  "selected"  segments  and  "normal 
words."  The  "selected"  items  have  been  excluded  from  the  study; 
the  "normal  words"  are  viewed  as  our  first  priority  for  study, 
especially  since  they,  after  all,  must  be  the  prime  object  of 
learning  and  teaching  for  braille  instruction  and  since  they  will 
potentially  occur  in  almost  any  English  text,  i.e.  they  are  as  a 
set  maximally  context-free.  They  most  centrally  characterize 
English.

As  in  previous  processing  of  the  Brown  Corpus  samples, 
apparent errors are not corrected. The Kučera analysis seems to
use  the  same  approach  of  letting  errors  stand.  In  cases  in  which 
the  classification  of  an  item  is  questionable  it  seems  better  to 
lean  toward  "selecting"  the  item  (that  is,  excluding  it  from  the 
study)  so  that  as  far  as  possible  the  remaining  text  can  be  viewed 
as  a  defensible  base  of  normal  words. 
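
The policy of excluding doubtful items can be pictured as a conservative filter. The sketch below is only a rough illustration: the surface tests (digits, hyphens, all-capital tokens, non-alphabetic characters) are our simplifying assumptions, and classes such as proper names, titles, or foreign words cannot be recognized this way and would still require marking by hand. The function name classify is hypothetical.

```python
def classify(token: str) -> str:
    """Label a space-delimited token 'selected' (excluded) or 'normal'.

    Punctuation is assumed to have been stripped already. Whenever a
    token looks doubtful, the function leans toward 'selected'.
    """
    if not token:
        return "selected"
    if any(ch.isdigit() for ch in token):
        return "selected"  # numerals and letter-number combinations
    if "-" in token:
        return "selected"  # hyphenated words and spelled-out forms
    if len(token) > 1 and token.isupper():
        return "selected"  # acronyms and Roman numerals in capitals
    if not token.isalpha():
        return "selected"  # apostrophes, formula symbols, other doubtful cases
    return "normal"


for token in ["tension", "fade-in", "105", "w-i-d-e", "XIV", "don't", "I"]:
    print(token, classify(token))
```

The ordering of the tests does not matter for the outcome, since every failing test leads to the same "selected" label; only a token that passes all of them is retained as a normal word.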

The "selected" items which have been excluded include the
classes  below. 

We  intend  at  a  later  date,  when  time  permits,  to  study  the 
numerical  effects  of  these  exclusions,  especially  by  category  of 
"selected"  items.  It  is  already  apparent  to  us,  as  we  reflect 
since  initiating  our  task  of  selection  and  counting,  that  our 
totals  at  that  time  will  change  very  slightly  as  we  revise  some 
individual  judgments.  But  these  changes  will  be  minimal,  and  at 
present would necessitate an entire wasteful recalculation.

1. Those with arbitrary length and constituency.

Numerals: By definition these may be indefinitely large. They
involve  signs  which  are  discretely  different  from  linguistic 
word  markers.  Moreover,  it  is  difficult  to  predict  the  number 
of  English  words  which  will  surface;  e.g.  a  hundred  and  five 
or  one  hundred  five.  We  refer  here  to  Arabic,  Roman  and 
letter-number  combinations. 

Acronyms:  One  cannot  always  say  how  many  (small  grammatical) 
words  are  represented.  Sometimes  the  acronym  itself  is  meant 
as  a  word;  often  not.  A  very  chancy  class. 

Mathematical formulas: These, represented in the Brown Corpus
by the symbol "**f," are arbitrary with respect to our study
since they obey mathematical but not linguistic syntax. We
also cannot predict precisely how they will be vocalized in
language.

Ellipses: The amount of text so treated is not always easy to
specify or depends upon parsing. We cannot classify zero
here. Perhaps suspension points can be classed with
punctuation.

2. Systematically aberrant.

Printer  ornaments:  These  are  not  language,  in  a  strict  sense. 

Greek letters: Representation of these letters in Roman
typography requires the use of arbitrary multiple-character
symbols. Transliteration may not be consistent.

Hyphenated words: fade-in, half-intensity. Compounds which
are partially hyphenated and partially spaced are treated as
hyphenated. Examples: often-blood thirsty, mid-twentieth
century. English also has no upper limit: quasi--pay as you
go. Where does the hyphen belong? Words may be written with
hyphens between letters, such as w-i-d-e. Like pig-Latin,
these are oddly derived from "normal words."

Proper names: Names of persons; headings and titles of
articles, plays, books, other publications; names of
geographical places and locations; names of governmental units
and other organizations; names of specialized equipment.
Names in English and modern styles (e.g., literary titles)
have few limits. Names refer, but do not mean; this can offer
some problems of judgment. When is a place a name? There is
also an old philosophical problem of uniqueness (the sun: is
that a name?). Geographic names (and ethnica such as the
!Kung) can be "foreign" (see below). Names of organizations
can be wordy and expansible. How many words are there in a
full personal name?

Letters  representing  mathematical  variables:  These  items  may 
be  somewhat  like  proper  names.  In  the  phrases  "the  point  p" 
and  "the  horse  Secretariat"  are  not  the  terms  "p"  and 
"Secretariat"  linguistically  analogous? 

3. Debatable items.

Abbreviations: Although some of these embrace multiples of
words (i.e., e.g.), or are of ambiguous rendering (e.g. = e
g or for example; ca. = circa or about or approximately or
around), others have taken on a life of their own as words
(etc. as an adverb etcetera, with a variant ex(c)et(e)ra among
others).

Print  contractions:  In  cases  such  as  don't  or  and/or  it  is 
not  perfectly  clear  whether  we  are  to  assign  the  count  before 
or after contraction. A form such as whaddya, which tends to
belong  to  a  style  reflecting  colloquial  spoken  varieties, 
really  reflects  a  separate  class  of  "folk  phonetic"  spellings. 

Foreign words: In one sense these could be called
systematically aberrant and dismissed as such; it could be
claimed that their inclusion in our count would imply an exact
knowledge of all the world's languages. Yet there are
specimens that have surely been incorporated in our language
(wadi, ladino, samurai, maharaja, kanaka2, aa2, cosa nostra2,
Zeitgeist, hajj, mukluk, kachina, milpa). The indecision that
our tradition systematically imposes on us is considerably
alleviated for German speakers, who call naturalized
borrowings (of any age) Lehnwörter "loan-words," but adoptions
which still carry a foreign flavor Fremdwörter "foreign-words."
In questionable cases the braille rule of identifying
as foreign those words which are in a different type face is
applied.

Titles:  These  form  a  troublesome  class.  In  combination  with 
names  they  might  be  considered  as  components  of  proper  names. 
Yet  alone  they  seem  to  function  as  "normal"  appellatives. 

Days  of  the  week  and  month  names:  Though  these  can  well,  on 
rules  of  capitalization,  be  regarded  as  proper  names,  they 
also,  as  grammatical  classes  with  finite  membership,  can  be 
taken  as  a  subtype  of  "normal  words." 

We may therefore note residually the classes of items not
"selected," i.e. text material treated as part of the group of
"normal words":

Words in vocabularies of specialized subjects: lats, reps.

Dialectal  representations:  git,  'scuse. 

Historic  contractions:  The  combination  of  cabin  and  hut  contained 
in  the  word  cahoots  has  somewhat  receded  into  the  past. 

Spaced  Compounds:  While  for  this  study,  the  foregoing  items,  in 
general,  can  be  identified  by  simple  inspection,  a  somewhat 
different  type  of  problem,  for  which  proper  identification  may 
require  a  considerable  amount  of  judgment,  is  presented  by  the 
spaced compounds. These items are included in the count and no
doubt constitute an exception to the principle of "general
agreement" regarding the content of the data upon which the study
is based. In such compound terms as grand jury and social
contract, the elements, grand, jury, social, and contract, are
treated  as  individual  words  even  though  our  native  linguistic  sense 
might  lead  us  to  regard  the  whole  span  as  a  single  semantic  word. 

Since  the  focus  of  the  study  was  primarily  upon  the  words 
found  in  the  text  rather  than  upon  the  discourse  structure  of  the 
text,  capitalization,  italicizing,  and  punctuation  markings  were 
ignored.

The text samples, after modification, contained a total of
45408  words.  In  this  total  and  in  the  following  tables,  a  "word" 
corresponds  to  what  is  called  a  "token"  in  the  Brown  Corpus 
introductory  description.  That  is,  for  multiple  occurrences  of  the 
same  lexical  word,  each  occurrence  of  the  word  in  the  text  is 
registered  in  the  count. 

In  order  to  make  the  tabulations,  each  word  was  assigned  a 
length  expressed  as  a  three-number  combination:  C,N,S.  The  first 
number  is  the  count  of  the  print  characters  in  the  word.  The 
second  is  the  count  of  braille  units  in  the  braille  translation  of 
the  word.  The  third  number  is  the  count  of  braille  shapes.  For 
example,  for  the  word  father  C,N,S  has  a  value  of  6,1,2.  It 
contains  six  characters  in  print;  in  braille  it  consists  of  one 
braille  unit  and  has  two  braille  shapes.  The  length  of  the  word 
tension  can  be  expressed  as  7,3,4.  It  has  seven  letters,  three 
braille  units,  and  in  braille  is  written  using  four  shapes 
occupying  four  cells. 
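
The assignment of a C,N,S length can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the study: the function name cns is hypothetical, each word is supplied already divided into braille units by hand, and the segmentation shown for tension is one plausible division assumed for the example.

```python
def cns(units):
    """Return the (C, N, S) length of a word.

    `units` is a list of (print_letters, shape_count) pairs, one pair per
    braille unit. The segmentation is supplied by hand (an assumption of
    this sketch); nothing here translates print into braille.
    """
    c = sum(len(letters) for letters, _ in units)  # print characters
    n = len(units)                                 # braille units
    s = sum(shapes for _, shapes in units)         # braille shapes (cells)
    return c, n, s

# father: a single unit written with two shapes.
father = [("father", 2)]
# tension: three units, the last taking two shapes (assumed segmentation).
tension = [("t", 1), ("en", 1), ("sion", 2)]

print(cns(father))   # (6, 1, 2)
print(cns(tension))  # (7, 3, 4)
```

Keeping the shape count per unit separate from the unit count is what allows N and S to differ for the same word.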

The  first  Table  gives  the  print  character  lengths,  braille 
unit  lengths,  and  braille  shape  lengths  of  the  words  found  in  the 
text.  Specifications  of  lengths  are  contained  in  column  L.  Column 
C  shows  the  number  of  text  words  having  print  character  lengths 
equal  to  the  specified  lengths  in  L.  Column  N  shows  the  number  of 
words  which,  when  translated  into  braille,  have  lengths  in  braille 
units  equal  to  L.  In  column  S  are  found  the  number  of  words  having 
lengths  in  braille  shapes  equal  to  the  L  lengths.  The  columns  to 
the  right  of  C,  N,  and  S  give  the  percentages  of  words  having  each 
length and the cumulative percentages of words having lengths equal
to or less than the specified lengths.
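
The percent and cumulative percent columns can be reproduced from the raw counts. The sketch below is a minimal illustration with a hypothetical function name; it assumes that each cumulative figure is the running sum of the rounded percentages, which appears consistent with the final cumulative entry of 100.001 in column C.

```python
def percent_columns(counts, total):
    """Return (percent, cumulative percent) lists, rounded to 3 places.

    The cumulative column adds up the already rounded percentages
    (an assumption about how the table was computed).
    """
    percents = [round(100 * c / total, 3) for c in counts]
    cumulative, running = [], 0.0
    for p in percents:
        running = round(running + p, 3)
        cumulative.append(running)
    return percents, cumulative

# Column C of Table 1 for lengths 1, 2, and 3; 45408 words in all.
pct, cum = percent_columns([1355, 6152, 11944], total=45408)
print(pct)  # [2.984, 13.548, 26.304]
print(cum)  # [2.984, 16.532, 42.836]
```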

Because  of  the  many  contractions  (phonograms,  morphograms,  and 
logograms)  which  are  a  major  feature  of  the  braille  code,  braille 
words  tend  to  have  fewer  characters  than  corresponding  words  in 
print.  The  analysis  gives  some  details  in  regard  to  specific 
effects  of  this  compression. 

While  the  majority  of  words  in  both  print  and  braille  are 
relatively  short,  braille  has  a  greater  concentration  of  short 
words.  As  shown  in  Table  1,  the  most  frequently  occurring  word 
length  in  print  is  a  length  of  three  characters.  In  braille  the 
most  frequent  length  both  in  terms  of  units  and  shapes  is  a  length 
of  one.  In  braille  more  than  one-half  of  the  words  have  two  or 
fewer units or shapes. In print a maximum character length of two
accounts for about one-sixth of the words. In the Table there
appear  to  be  several  points  at  which  braille  word  lengths  (N  and  S) 
correspond  in  frequency  to  print  character  lengths  (C)  greater  by 
two.  Approximately  90  percent  of  the  words  have  six  or  fewer  units 
and  shapes  but  up  to  eight  characters.  Similarly,  the  percent  of 
words  which  contain  ten  or  fewer  braille  units  and  shapes  is  about 
the same as the percent of words made up of twelve or fewer print
characters.

We  may  paraphrase  the  foregoing  observations  in  somewhat 
different  quantitative  terms:  Over  1/4  of  all  inkprint  words 
comprise 3 letters; there is then a bell-shaped decrement over
nearly 20 letters of length. Braille shows a strong contrast in
this respect. Over 2/5 of all words, or 4 words in 9, consist of
but  one  braille  unit.  For  greater  lengths  the  decrement  is 
stepwise:  to  5,  then  to  7,  to  9,  then  10,  and  to  14  in  exiguous 
numbers.  Thus  we  see  that  braille  units  are  1/2  again  (i.e.  a 
ratio  of  43.5  to  26.3)  as  economical  as  inkprint  characters  in  the 
most  frequently  occurring  length.  In  other  words,  the  compression 
has  been  introduced  in  a  highly  useful  set  of  items,  and  any 
improvement  in  the  code  in  this  regard  would  presumably  have  to 
consider  items  one  by  one  in  terms  of  text  occurrence. 
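
The ratio quoted in this paragraph can be checked directly from the two percentages in Table 1. The variable names below are illustrative only; the result, about 1.65, shows that "1/2 again" is a round figure.

```python
# Shares (in percent) of the most frequent lengths, read from Table 1.
one_unit_share = 43.512      # words written with a single braille unit
three_letter_share = 26.304  # words of three print characters

ratio = one_unit_share / three_letter_share
print(round(ratio, 2))  # 1.65
```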

With  the  exception  of  lengths  of  one,  shape  counts  come  close 
to  braille  unit  counts;  except  for  one  interesting  inversion  at 
length  5,  the  shape  count  is  always  slightly  higher  than  the  unit 
count.  Now  we  may  remove  the  anomalies  of  lengths  one  and  two 
substantially  by  summing  them;  this  clearly  results  from  the 
analytic  fact  that  braille  units  often  consist  of  two  shapes.  We 
may  draw  from  this  an  important  consequence  of  braille  shape 
design:  It  appears  that  braille  units  derive  their  diversity 
(recognition  characteristics)  without  exploiting  undue  shape 
complexity. To attain this diversity (optimization of word
length) otherwise would entail packing more information in single
braille  shapes. 

                                  Table 1

                   cum.                     cum.                     cum.
 L      C percent  percent       N percent  percent       S percent  percent
 1   1355   2.984    2.984   19758  43.512   43.512   16360  36.029   36.029
 2   6152  13.548   16.532    4766  10.496   54.008    7415  16.330   52.359
 3  11944  26.304   42.836    4066   8.954   62.962    4284   9.434   61.793
 4   7379  16.250   59.086    4852  10.685   73.647    4904  10.800   72.593
 5   5277  11.621   70.707    4406   9.703   83.350    4315   9.503   82.096
 6   3307   7.283   77.990    3063   6.746   90.096    3165   6.970   89.066
 7   3252   7.162   85.152    2126   4.682   94.778    2264   4.986   94.052
 8   2525   5.561   90.713    1226   2.700   97.478    1339   2.949   97.001
 9   1734   3.819   94.532     609   1.341   98.819     743   1.636   98.637
10   1130   2.489   97.021     307    .676   99.495     358    .788   99.425
11    693   1.526   98.547     147    .324   99.819     168    .370   99.795
12    371    .817   99.364      54    .119   99.938      58    .128   99.923
13    170    .374   99.738      23    .051   99.989      30    .066   99.989
14     82    .181   99.919       5    .011  100.000       5    .011  100.000
15     29    .064   99.983
16      5    .011   99.994
17      3    .007  100.001

Table 2 lists the most frequently occurring of the three-number
combination lengths assigned to the words. Length values
are listed which characterize one percent of all the words or more.


          Table 2

C,N,S      Words   Percent
3,1,1       8764    19.301
2,1,1       3459     7.618
2,2,2       2693     5.931
4,1,1       2240     4.933
4,4,4       2123     4.675
3,3,3       1507     3.319
4,3,3       1498     3.299
5,4,4       1435     3.160
5,5,5       1421     3.129
1,1,1       1355     2.984
6,5,5       1259     2.773
5,1,2       1144     2.519
3,2,2        995     2.191
7,6,6        921     2.028
4,1,2        867     1.909
6,6,6        781     1.720
7,5,5        765     1.685
3,1,2        678     1.493
7,7,7        648     1.427
8,7,7        613     1.350
8,6,6        609     1.341
6,4,4        573     1.262
4,2,2        543     1.196
5,1,1        470     1.035
5,3,3        451      .993

Of  the  various  three-number  lengths  assigned  to  the  words  in 
the  text,  Table  2  shows  that  the  length  3,1,1  (three  print 
characters,  one  braille  unit,  and  one  braille  shape)  is  by  far  the 
most  frequent.  Words  of  this  type  occur  more  than  two  and  one-half 
times  as  often  as  words  of  any  other  length. 

Certain  patterns  in  the  most  frequently  occurring  lengths 
suggested  further  tabulations.  Table  3  shows  the  frequencies  of 
words  whose  lengths  have  a  configuration  of  C  =  N  =  S.  In  Table  4 
there  are  counts  of  words  represented  by  one  unit  and  one  shape  in 
braille so that N = S = 1.


          Table 3

C,N,S      Words   Percent
1,1,1       1355     2.984
2,2,2       2693     5.931
3,3,3       1507     3.319
4,4,4       2123     4.675
5,5,5       1421     3.129
6,6,6        781     1.720
7,7,7        648     1.427
8,8,8        390      .859
9,9,9        197      .434
10,10,10     119      .262
11,11,11      57      .126
12,12,12      10      .022
13,13,13       8      .018
14,14,14       1      .002

Total      11310    24.908

According  to  Table  3  about  one-fourth  of  the  words  have  an 
equal  number  of  characters,  units,  and  shapes.  These  lengths 
represent  words  which  do  not  contain  braille  contractions,  i.e., 
are  composed  of  alphabetic  letters. 


          Table 4

C,N,S      Words   Percent
1,1,1       1355     2.984
2,1,1       3459     7.618
3,1,1       8764    19.301
4,1,1       2240     4.933
5,1,1        470     1.035
6,1,1         68      .150
9,1,1          4      .009

Total      16360    36.030

The  most  prevalent  three-number  lengths  also  indicate  that 
many  words  have  one  or  two  fewer  components  in  braille  than  in 
print.  In  Tables  5  and  6  counts  of  these  words  are  listed. 
Lengths  in  Table  5  have  the  pattern  N  =  S  =  C  -  1.  Table  6 
displays the numbers of words having an N = S = C - 2 length pattern.
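
The three equality patterns behind Tables 3, 5, and 6 can be expressed as a small classifier. The sketch below uses a hypothetical function name and a handful of entries copied from Table 2. The N = S = 1 grouping of Table 4 overlaps the other patterns (1,1,1 also satisfies C = N = S), so it is left out of this sketch.

```python
from collections import Counter

def pattern(c, n, s):
    """Name the length pattern of a (C, N, S) triple, or 'other'."""
    if n != s:
        return "other"
    if c == n:
        return "C = N = S"
    if n == c - 1:
        return "N = S = C - 1"
    if n == c - 2:
        return "N = S = C - 2"
    return "other"

# A few (C, N, S) lengths with their word counts, copied from Table 2.
table2 = {(3, 1, 1): 8764, (2, 1, 1): 3459, (2, 2, 2): 2693,
          (4, 1, 1): 2240, (5, 1, 2): 1144}

totals = Counter()
for (c, n, s), words in table2.items():
    totals[pattern(c, n, s)] += words

print(totals["N = S = C - 2"])  # 8764
print(totals["other"])          # 3384
```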

          Table 5

C,N,S      Words   Percent
2,1,1       3459     7.618
3,2,2        995     2.191
4,3,3       1498     3.299
5,4,4       1435     3.160
6,5,5       1259     2.773
7,6,6        921     2.028
8,7,7        613     1.350
9,8,8        358      .788
10,9,9       184      .405
11,10,10     102      .225
12,11,11      55      .121
13,12,12      22      .048
14,13,13       9      .020
15,14,14       3      .007

Total      10913    24.033

          Table 6

C,N,S      Words   Percent
3,1,1       8764    19.301
4,2,2        543     1.196
5,3,3        451      .993
6,4,4        573     1.262
7,5,5        765     1.685
8,6,6        609     1.341
9,7,7        388      .854
10,8,8       182      .401
11,9,9        93      .205
12,10,10      35      .077
13,11,11      21      .046
14,12,12      12      .026
15,13,13       6      .013
16,14,14       1      .002

Total      12443    27.402

From the list of the most frequently occurring lengths (Table
2) it is notable that many words are translated into one unit and
one shape in braille. Table 4 shows that these words make up
36 percent of the total. Other prominent configurations in the list of
most frequent word lengths are those in which the number of units
and shapes is one less or two less than the number of print
characters. Tables 5 and 6 show that words having these length
combinations make up more than one-half of the text. Tables 5 and
6, of course, include some counts from Table 4.

The  last  three  tables  present  the  interactions  among  the  three 
factors:  print  character  length,  braille  unit  length,  and  braille 
shape  length.  In  Table  7,  the  numbers  to  the  left  are  character 
lengths and column headings at the top are unit lengths.
Similarly,  Table  8  shows  the  relationships  between  character 
lengths  and  braille  shape  lengths,  and  Table  9  shows  the 
relationships  between  braille  unit  and  shape  lengths.  From  the 
three  tables  may  be  found  the  word  count  for  any  combination  of  two 
of  the  three  length  factors,  C,  N,  and  S.  Each  table  is  followed 
by  diagonal  totals  which  serve  to  summarize  relationships  among  the 
factors.
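
A two-factor table of this kind, with its diagonal totals, can be built from the list of C,N,S triples. The sketch below is hypothetical throughout: the function names are invented, the sample triples are not data from the study, and a diagonal total is assumed to collect all cells sharing the same difference between the row length and the column length.

```python
from collections import Counter

def cross_tab(triples, row, col):
    """Count words for each (row, col) pair of length factors.

    `triples` holds one (C, N, S) tuple per word; `row` and `col` are
    positions in the tuple (0 = C, 1 = N, 2 = S).
    """
    return Counter((t[row], t[col]) for t in triples)

def diagonal_totals(table):
    """Sum the cells on each diagonal, keyed by row length minus column length."""
    totals = Counter()
    for (r, c), words in table.items():
        totals[r - c] += words
    return totals

# Hypothetical mini-sample, not data from the study.
sample = [(6, 1, 2), (7, 3, 4), (3, 1, 1), (3, 1, 1), (2, 2, 2)]

c_by_n = cross_tab(sample, row=0, col=1)  # characters by units
print(c_by_n[(3, 1)])                     # 2
print(diagonal_totals(c_by_n)[0])         # 1 (words with C = N)
```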


Table  7 


Table  8 

Tables 7 and 8 indicate that while there is a correspondence
between  word  lengths  in  print  and  braille,  for  any  one  print 
character  length  a  wide  range  of  braille  unit  lengths  or  braille 
shape  lengths  may  be  found.  For  example:  words  having  nine 
characters  may  be  represented  by  anywhere  from  one  to  nine  units  in 
braille;  words  of  eleven  characters  on  a  print  page  may  occupy  from 
three  to  eleven  shapes  in  braille. 

There  is  rather  close  correspondence  between  braille  units  and 
braille shapes, as is evident from Table 9. More than 85 percent of
the braille words in the study have the same number of shapes as
units.

The  tabulations  resulting  from  this  study  provide  a  profile  of 
word  lengths  in  braille  as  compared  with  word  lengths  in  print 
text.  It  is  hoped  that  this  information  will  be  of  interest  and 
helpful  for  aspects  of  braille  education. 

An  ability  to  specify  the  range  of  length  found  in  words  in 
text  constitutes  for  a  language  such  as  English  an  important 
characterizing  feature  of  the  nature  and  texture  of  discourse  and 
written  matter  in  that  language. 


Table  9 

Notes

1Our choice of the vague word "include" is deliberate. Anyone who
sits down seriously to this job will find rapidly that the borders
to judgment become perplexing, and the more one consults reflective
philosophers and linguists the more the fuzzy and gray boundaries
multiply.

2Listed  in  the  1968  Random  House  Dictionary,  College  Edition,  but 
not  in  Webster's  Ninth  New  Collegiate  Dictionary  (1983). 

Abstract

The  so-called  lower-sign  words  form  a  salient  class  of  the  American 
braille  code.  The  incidence  of  these  braille  shapes,  as  well  as 
that  of  inkprint  sequences  which  avoid  such  braille  representation, 
deserves careful and analytically explicit study. Nine Grade 2
braille  words  define  the  framework  for  the  investigation.  The  text 
sample  serving  as  data  base  for  the  observations  has  been  taken  in 
principled  fashion  from  the  Brown  Corpus  as  described  in  the  second 
paper of this collection and from a recent number of the Reader's
Digest,  and  categorized  in  accordance  with  Hamp  and  Caton  (1984). 
The  occurring  categories  of  representation  are  defined,  and  their 
incidence  by  text  subcategory  tabulated  for  frequency  and  for 
proportionate  occurrence.  Comparative  data  is  offered  on  the 
incidence  of  braille  units  and  inkprint  for  the  letter  sequences  in 
question.  Some  initial  comment  is  offered  on  various  aspects  of 
the  numerical  findings,  including  the  implication  of  importance  for 
the  learning  and  teaching  of  braille. 


The  Text  Frequency  and  Incidence 
of  Lower-Sign  Sequences  in  American  Braille 

Introduction: Scope of the Study

The  lower-sign  words  of  braille  form  a  class  which  has 
justifiably  attracted  the  attention  of  scholars  and  users  of 
braille.1  The  mere  fact  that  those  closest  to  braille  have 
perceived  them  as  a  group  indicates  that  they  must  engage  our 
serious  attention  in  a  systematic  way  if  we  are  to  study  the 
braille  code  with  the  care  and  structured  precision  which  it 
deserves.  That  is  of  course  a  declaration  of  intellectual 
obligation,  a  duty  which  we  owe  to  one  aspect  of  our  strivings  to 
understand  and  explain  the  mechanics  of  human  behavior  and 
cognition.  There  is  also  a  practical  side  to  such  a  concern  which 
is  of  obvious  importance:  The  attentive  study  of  such  a  class  of 
signs  and  their  surrogates  must  surely  yield  in  the  end 
implications  for  the  sound  teaching,  effective  use,  and  wise 
administration  of  the  braille  code.  Valid  instruction,  viable 
instructional  materials,2  and  even  revisions  in  the  code,  envisaged 
or  mooted,  must  be  grounded  in  the  observation  of  relevant  data  and 
in  the  analysis  and  formulation  of  its  systematic  properties.  The 
analysis  must  be  exact  and  informed. 

The  lower-sign  words  form  a  set  which  may  in  an  obvious  way  be 
defined  as  having  as  members  logograms  characterized  by  a 
restricted  dot  configuration  in  their  shapes.  The  physical  and 
perceptual  attributes  and  consequences  of  this  concentration  in  the 
disposition of their dots have been studied long since by scholars
of braille.3 It is also clear to any grammarian that the English
words represented by these lower-sign shapes are for the most part
highly  basic  short  working  elements  of  English  grammar. 

The  purpose  of  the  present  study  is  to  investigate  the 
incidence of occurrence and avoidance of the lower-sign words and
syllables  in  normal  running  American  English  text.  It  is  intended 
eventually  to  study  the  distribution  and  dependencies  of 
occurrence,  i.e.  the  syntax  or  syntactic  properties  in  the  broadest 
sense  of  these  signs  and  allied  sequences  in  relation  to  other 
braille  elements  and  to  the  grammatical  syntax  of  English.  As  a 
first  step  in  this  direction  we  present  here,  largely  in  tabular 
form  and  with  some  commentary,  the  numerical  account  of  the 
incidence  of  the  lower  signs  and  their  surrogates  in  a  significant 
sample  of  various  genres  of  American  English  prose. 

We  must  first  distinguish  the  categories  of  use  of  those  signs 
and  their  related  sequences. 

In Grade 2 braille the words which are designated as lower-sign
words are nine in number: (be), (by), (enough), (his), (in),
(into), (to), (was), and (were). These lower-sign words, which are
also logograms, are represented by cells in which only the lower
two rows are occupied by dots.
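
The defining property of a lower sign, that no dot falls in the top row of the cell, is easy to state as a test. The sketch below uses the conventional numbering of the cell (dots 1, 2, 3 down the left column and 4, 5, 6 down the right), so that the lower two rows hold dots 2, 3, 5, and 6. The dot patterns listed for the nine words are those commonly given for Grade 2 braille; they are an assumption added here and are not taken from the text above.

```python
LOWER_DOTS = {2, 3, 5, 6}  # dots in the lower two rows of the cell

def is_lower_sign(cells):
    """True if every cell is non-empty and uses only lower-row dots.

    `cells` is a list of sets of dot numbers, one set per braille cell.
    """
    return all(bool(cell) and cell <= LOWER_DOTS for cell in cells)

# Dot patterns as commonly listed for Grade 2 braille (assumed here).
lower_sign_words = {
    "be": [{2, 3}],
    "by": [{3, 5, 6}],
    "enough": [{2, 6}],
    "his": [{2, 3, 6}],
    "in": [{3, 5}],
    "into": [{3, 5}, {2, 3, 5}],
    "to": [{2, 3, 5}],
    "was": [{3, 5, 6}],
    "were": [{2, 3, 5, 6}],
}

print(all(is_lower_sign(c) for c in lower_sign_words.values()))  # True
print(is_lower_sign([{1, 2}]))  # False: the letter b uses dot 1
```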

In  order  to  study  the  occurrence  of  these  words  in  samples  of 
braille  text,  we  selected  a  corpus  consisting  of  sections  taken 
from  the  Brown  Corpus  and  of  articles  from  the  April  1991  issue  of 
the Reader's Digest translated into braille. The sections from the
Brown Corpus had been selected in the manner and according to the
principles described by the authors in the second paper of this
collection.

The  letter  sequences  which  form  these  English  words  in  print 
may  be  represented  in  various  ways  in  braille.  In  many  cases,  of 
course,  we  are  concerned  with  the  whole  English  word.  The  English 
word  may  be  contracted,4  partly  contracted,5  or  uncontracted6  in  its 
braille  representation.  In  some  cases,  certain  of  these  print 
sequences,  which  can  in  themselves  be  contracted,4  may  occur  as 
part  of  a  longer  sequence  of  inkprint  letters  for  which  there  is  a 
different braille contraction,4 itself a logogram in the terms of
Hamp  and  Caton  (1984).  Also,  these  inkprint  sequences  often 
constitute  parts  of  other  words.7  Such  words  may  or  may  not  have 
meanings  related  to  the  meanings  which  the  sequences  have  when  they 
stand  alone.8 

Categories  of  Representation 

From  the  sample  text  it  appeared  that  there  were  18  possible 
different  relations  between  braille  representations  and  the 
inkprint  letter  sequences  of  interest  to  a  consideration  of  lower 
signs  in  braille.  In  each  case  the  situation  is  characterized  in 
brief  form  and  in  terms  that  will  be  familiar  to  users,  teachers, 
and  scholars  of  braille  within  a  long-standing  tradition,  and  the 
categorization  of  Hamp  and  Caton  (1984)  follows  in  parentheses. 
There  is  in  what  follows  no  essential  departure  from  the  current 
practice  and  system  of  braille.  The  novelty  lies  in  the 
completeness  of  the  sample  and  display  and  in  the  consistency  of 
the analytic categorization presented.

1.  Whole words contracted, such as (be), (his), etc.
    (logograms)

2.  Whole words partly contracted because of being adjacent
    to another lower-sign word. (An example would be the
    partial contracting of (enough) in (to )(en)(ou)(gh).
    This category, strictly consisting of phonograms, while
    possible, did not seem to be found in the sample.)

3.  Whole words partly contracted because proximity to the
    end of the braille line did not permit annexing the next
    word, for example (in)to. (phonograms) (alphabetic
    letters)

4.  Whole words partly contracted because of adjacent
    punctuation. (phonograms) (alphabetic letters)

5.  Whole words uncontracted because of being adjacent to
    another lower-sign word. (alphabetic letters)

6.  Whole words uncontracted because proximity to the end of
    the braille line did not permit annexing the next word.
    (alphabetic letters)

7.  Whole words uncontracted because of adjacent punctuation.
    (alphabetic letters)

8.  Lower signs as parts of contractions and maintaining
    their shape. An example is (be) as part of (before) in
    the contraction (be)f. (parts of logograms)

9.  Lower-sign surrogates as parts of contractions
    conventionally viewed as showing the lower sign altered.
    An example is to as part of (today) in the contraction
    td. (parts of logograms)

10. Inkprint letter sequences as sections of compound words.
    An example is in in drive-in. (alphabetic letters)

11. Inkprint letter sequences as parts of other words,
    contracted, but unrelated in meaning. Some examples are
    (in) in f(in)d (phonograms) or (in) in (in)adequate
    (morphograms).

12. Inkprint letter sequences as parts of other words,
    contracted, and related in meaning. An example is (be)
    in (be)(ing) (phonograms) or (in) in (in)side
    (morphograms).

13. Inkprint letter sequences as parts of other words,
    uncontracted, and unrelated in meaning. An example is by
    in baby. (alphabetic letters)

14. Inkprint letter sequences as parts of other words,
    uncontracted, and related in meaning. An example is to
    in onto. (alphabetic letters)

15. Inkprint letter sequences as parts of other words, in
    which one or more letters of the lower-sign letter
    sequence are combined with adjacent letters in
    contractions, and the meaning of the word is unrelated to
    the meaning of the corresponding lower-sign word.
    Examples are his in (wh)isp(er), was in wa(sh)(ed), and
    to in a(st)(ound)(ing). (phonograms) (morphograms)
    (alphabetic letters)

16. Inkprint letter sequences as parts of other words, in
    which one or more letters of the lower-sign letter
    sequence are combined with adjacent letters in
    contractions, and the meaning of the word is related to
    the meaning of the corresponding lower-sign word. An
    example is to in t(ow)(ar)d. (phonograms)

17. Inkprint letter sequences as parts of other words, partly
    contracted, and unrelated in meaning. An example is
    (in)to in m(ount)a(in)top. (phonograms) (alphabetic
    letters) (morphograms)

18. Inkprint letter sequences as parts of other words, partly
    contracted, and related in meaning. Some examples are
    w(er)(en)'t and (there)(in)to. (phonograms) (alphabetic
    letters) (morphograms)

In some cases a lower-sign occurrence may belong to two
categories. Examples: in (to )be, the sequence be is adjacent to
a lower sign and also to punctuation and therefore falls into
categories 5 and 7; in l(ow)(er)(ed) the were sequence is partly
contracted and letters from the sequence combine with adjacent
letters to form contractions so that this occurrence belongs to
categories 15 and 17.

Summary of Categories of Braille Usage of Lower-Sign Sequences

 1. Whole word, contracted
 2. Whole word, partly contracted, adjacent to lower-sign word
 3. Whole word, partly contracted, at end of braille line
 4. Whole word, partly contracted, adjacent to punctuation
 5. Whole word, uncontracted, adjacent to lower-sign word
 6. Whole word, uncontracted, at end of braille line
 7. Whole word, uncontracted, adjacent to punctuation
 8. Part of another contraction, shape retained
 9. Part of another contraction, shape altered
10. Section of a compound word
11. Part word, contracted, unrelated meaning
12. Part word, contracted, related meaning
13. Part word, uncontracted, unrelated meaning
14. Part word, uncontracted, related meaning
15. Part word, overlapped contractions, unrelated meaning
16. Part word, overlapped contractions, related meaning
17. Part word, partly contracted, unrelated meaning
18. Part word, partly contracted, related meaning

We try in this study to dwell on central issues and
characteristics of the topic. Because we attempted to cast as wide
a net as possible, three types of phenomena have appeared in our
data which seem to us peripheral in varying degrees; they result
from the attempt to make complete correlations with inkprint
spelling.




1. Probably categories 13 and 14 are of less central
interest to our topic than many others, but they are
interesting in exploring the terrain (e.g. hiss(ing)) and
defining the limits, and they are probably of more
concern than the tricky issues of 15, 16, and 17, which
involve debatable theoretical questions of analysis.

2. Categories 15, 16, and 17 forced themselves on our
attention because of our insistence on inspecting all
sequences which qualified under any definition. Many of
the examples here could be eliminated on other grounds
and we may wish later to prune our list; but for the
present we feel it is better to allow our lists to stand.

3. There are a few items which have surfaced under
categories 9, 15, and 16 which our procedures threw up
(e.g. (this), wa(sh)(ed), (wh)i(st)ler, t(ow)(ar)d) and
which to varying degrees we would find inconsistent with
our analytic principles for the braille code. For the
present we allow them to stand, since they do not distort
our main results.

Brown  Corpus  Occurrences 

The tabulations which follow show the numerical occurrence in
the sample text corpus (Brown Corpus) of the lower-sign sequences
which have been identified above, subcategorized by the above 18
syntagmatic categories (in columns) and by genre-specified (A-R)
text corpus source (in rows). Each tabulation is identified and
titled by its relevant lower-sign word; but it should be noted
clearly that all 18 categories of sequence are accounted for.

The  Brown  Corpus  literature  types,  or  genres,  associated  with 
the  code  letters  in  the  first,  identifying,  column  are:  A.  Press: 
Reportage;  B.  Press:  Editorial;  C.  Press:  Reviews;  D. 
Religion;  E.  Skills  and  Hobbies;  F.  Popular  Lore;  G.  Belles 
Lettres,  Biography,  etc.;  H.  Miscellaneous;  J.  Learned  and 
Scientific  Writings;  K.  Fiction:  General;  L.  Fiction:  Mystery 
and  Detective;  N.  Fiction:  Adventure  and  Western;  P.  Fiction: 
Romance  and  Love  Story;  and  R.  Humor. 


Brown Corpus - Samples

BE

Sample   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
A01     20   -   -   -   1   -   -   2   -   -   4   5   2   7  14   -   -   -
A21      6   -   -   -   -   -   -   6   -   -   4   3   5  14  11   -   -   -
B01     15   -   -   -   2   -   -   7   -   -   9   1   4   9   9   -   -   -
C01      2   -   -   -   2   -   -   2   -   -   3   3   9   6   8   -   -   -
D01      7   -   -   -   2   -   -   6   -   -   5   4   6   8   3   -   -   -
E01      5   -   -   -   1   -   -  18   -   -   4   -  14   2  13   -   -   -
E21     33   -   -   -   4   -   1   3   -   -   3   1   4   2   6   -   -   -
F01      5   -   -   -   -   -   1   4   -   -   6   -   3   4   8   -   -   -
F21      9   -   -   -   2   -   -   5   -   -   3   -   1   8  12   -   -   -
G01      8   -   -   -   5   -   -   9   -   -   7   3   6  16  22   -   -   -
G21      6   -   -   -   3   -   1   8   -   -   4   -   4   9  25   -   -   -
G41      3   -   -   -   6   -   -   6   -   -   2   -   -   4   8   -   -   -
G61      5   -   -   -   4   -   1   8   -   -   8   -   2   9  10   -   -   -
H01     11   -   -   -   3   -   -   1   -   -   -   -   1   1   4   -   -   -
H21     13   -   -   -   7   -   -   4   -   -   3   2   2  10  15   -   -   -
J01     16   -   -   -   5   -   -   6   -   -   3   -   4  12  12   -   -   -
J21     15   -   -   -   1   -   1   1   -   -   -   1   2   1   2   -   -   -
J41     13   -   -   -   1   -   -   8   -   -   7   -   3   5   4   -   -   -
J61      9   -   -   -   8   -   -   4   -   -   7   2   -   6   9   -   -   -
K01      2   -   -   -   2   -   -   3   -   -   1   -   6   2  23   -   -   -
K21      5   -   -   -   3   -   1   7   -   -   6   1  13   7   8   -   -   -
L01      2   -   -   -   5   -   1   6   -   -   8   -   4   6   6   -   -   -
N01      4   -   -   -   1   -   -   6   -   -   -   1   6   4   6   -   -   -
P01      6   -   -   -   3   -   -  18   -   -   5   -   7   5   6   -   -   -
R01      5   -   -   -   5   -   -   6   -   -   6   2   9   8   3   -   -   -


Brown  Corpus 


Samples 


BY 

Sample   1
A01      9
A21     15
B01     11
C01     15
D01      5
E01      7
E21      7
F01     16
F21     20
G01      9
G21     11
G41      7
G61      7
H01     11
H21     19
J01     14
J21      7
J41     16
J61     11
K01      5
K21      5
L01      -
N01      2
P01      6
R01      7

2  3  4 


6  7  8  9  10  11  12  13  14  15  16  17  18 

1------2---- 

12---------- 


1 


1  ------  1 


16------- 

22------1 

21------- 

11------- 


1  -  2 


1 


1 


1 


2 


Brown  Corpus 


Samples 


ENOUGH 

1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17 

AO  1  -  --  --  --  --  --  --  --  - 

A21  ---------------- 

BO  1  --1------------ 

COi  ________________ 

DO  1 

E01  ________________ 

E21 

F01  l------------ 

F21  2--1------------ 

G01  -  -  -  -  -  - 

G21  - 

G41  l-------_-_-____ 

G61  -  ------- 

HOI 


H21 

J01 

1 

- 

J21 

J41 

J61 

5 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

KOI 

1 

- 

- 

1 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

K21 

1 

- 

- 

1 

- 

- 

- 

- 

- 

- 

- 

- 

- 

— 

- 

- 

- 

L01 

2 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

NO  1 

1 

P01 

2 

- 

- 

1 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

R01 

_ 

_ 

— 

1 

_ 

Brown Corpus - Samples

HIS

Sample   1   2   3   4   5   6   7   8   9  10
A01     12   -   -   -   -   -   -   -   9   -
A21     11   -   -   -   3   -   -   -   3   -
B01     11   -   -   -   -   -   -   -  10   -
C01     12   -   -   -   1   -   -   -  13   -
D01      7   -   -   -   -   -   -   -  20   -
E01     16   -   -   -   1   -   -   -  19   -
E21      -   -   -   -   -   -   -   -   6   -
F01     12   -   -   -   1   -   -   -  22   -
F21      -   -   -   -   -   -   -   -   7   -
G01      5   -   -   -   -   -   -   -  18   -
G21     16   -   -   -   1   -   -   -  18   -
G41      9   -   -   -   1   -   -   -   7   -
G61      4   -   -   -   1   -   -   -  12   -
H01      2   -   -   -   -   -   -   -   6   -
H21      -   -   -   -   -   -   -   -  19   -
J01      -   -   -   -   -   -   -   -   9   -
J21      -   -   -   -   -   -   -   -  13   -
J41      -   -   -   -   -   -   -   -  28   -
J61     11   -   -   -   -   -   1   -   9   -
K01     58   -   -   -   2   -   -   -   1   -
K21     41   -   -   -   3   -   -   -   3   -
L01      3   -   -   -   -   -   -   -  12   -
N01     21   -   -   -   1   -   -   -  11   -
P01     29   -   -   -   2   -   -   -   3   -
R01     30   -   -   -   3   -   -   -   2   -

13  14  15  16 


3 

1 

2 

3 

3 

3 

2 

9 

3 


3 

2 

1 

1 

2 


Brown Corpus - Samples

IN

Sample   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
A01     39   -   -   -   -   -   1   3  41   -  47   2   -   -   -   -   -   -
A21     51   -   -   -   -   -   -   2  63   1  48   5   -   -   1   -   -   -
B01     44   -   -   -   -   -   1   2  66   -  67   6   -   -   -   -   -   -
C01     33   -   -   -   -   -   -   3  72   -  69   5   -   -   -   -   -   -
D01     60   -   -   -   -   -   -   9  44   -  43   5   -   -   -   -   -   -
E01     30   -   -   -   -   -   -   5  58   -  59   6   -   -   -   -   -   -
E21     27   -   -   -   -   -   2   -  60   -  67   6   -   -   -   -   -   -
F01     39   -   -   -   -   -   -   1 100   1  71  10   -   -   -   -   -   -
F21     58   -   -   -   -   -   -   4  48   -  45  10   -   -   -   -   -   -
G01     46   -   -   -   -   -   -   1  49   -  62   6   -   -   3   -   -   -
G21     34   -   -   -   -   -   2   2  23   -  51   6   -   -   -   -   -   -
G41     33   -   -   -   -   -   2   -  50   -  42   2   -   -   1   -   -   -
G61     60   -   -   -   -   -   1   2  39   -  84  15   -   -   1   -   -   -
H01     36   -   -   -   -   -   -   -  56   - 107  17   -   -  58   -   -   -
H21     69   -   -   -   -   -   -   2  65   -  66   7   -   -   1   -   -   -
J01     37   -   -   -   -   -   1   -  29   -  67   2   -   -   -   -   -   -
J21     44   -   -   -   -   -   -  13  36   - 172  18   -   -   -   -   -   -
J41     50   -   -   -   -   -   -   -  55   - 137   4   -   -   -   -   -   -
J61     56   -   -   -   -   -   -   3  84   -  66   2   -   -   -   -   -   -
K01     35   -   -   -   -   -   -   3  54   -  61   -   -   -   -   -   -   -
K21     25   -   -   -   -   -   -   9  83   -  33   1   -   -   -   -   -   -
L01     24   -   -   -   -   -   -   3  62   -  40   1   -   -   1   -   -   -
N01     33   -   -   -   -   -   3   9  64   -  39   -   -   -   -   -   -   -
P01     33   -   -   -   -   -   1   5  90   -  43   4   -   -   -   -   -   -
R01     33   -   -   -   -   -   -   5  56   1  57   1   -   -   -   -   -   -
Brown Corpus


Samples 


INTO 


1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18 


AO  1 
A21 
BO  1 
C01 
DO  1 
E01 
E21 
F01 
F21 
GO  1 
G21 
G41 
G61 
HOI 
H21 
J01 
J21 
J41 
J61 
KOI 
K21 
L01 
NO  1 
P01 
ROl 


2  - 

13  - 

3“  l------_______2 


Brown  Corpus 


Samples 


TO 

1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17


A01  48

A21  36

B01  42

C01  41

D01  42

E01  39

E21  42

F01  47

F21  31

G01  45

G21  55

G41  70

G61  35

H01  49

H21  32

J01  28

J21  13

J41  50

J61  75

K01  41

K21  57

L01  71

N01  45

P01  42

R01  65


-6231--  -  40 

4  -  2  15  1  -  -  14 

-6-231-  -  19 

-113---  -  18 

-4-91---5 
325-1--  19 

-6--1---23 
-2-11---6 
-3-41---8 
-61131--8 
7-2---  -  15 

-7-----  -  12 

-1-21---7 
7--1---20 
4-222-  -  11 

43-1--  -  16 

3  10  13  -  -  -  -  19 

-81-----4 
-6133--  -  19 

-4331--  -  18 

--5-9---  -  10 

2  1  3  1  -  -  -  12 

9  2  -  -  -  11 

--3-5---  -  13 

--4-51--  -  11 


3  1 

6  1 
-85 
5 

-  2  - 

5 

-  7  - 

1  9 

-  3  - 

6  2 

3  1 

-  7  - 

1  10  2 

2  1 

5 

3 

-  3  - 

113- 

5  1 

-  14  3 

-91 

-  8  4 

1  20  2 

9  2 


Brown  Corpus 


Samples 


WAS 

1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17


AO  1 

18 

i 

i 

- 

- 

A21 

28 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

i 

- 

- 

- 

BO  1 

14 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

2 

- 

- 

C01 

8 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

1 

- 

~ 

DO  1 

E01 

5 

9 

___ 

; 

; 

_ 

__ 

_ 

__ 

__ 

; 

; 

i 

; 

; 

; 

E21 

F01 

12 

__ 

_ 

_ 

— 

— 

“ 

“ 

— 

i 

i 

* 

F21 

11 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

~ 

- 

i 

- 

- 

GO  1 

4 

G21 

22 

- 

- 

- 

- 

- 

i 

- 

- 

~ 

- 

- 

- 

- 

- 

- 

- 

G41 

53 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

3 

- 

- 

G61 

23 

- 

HOI 

11 

- 

- 

H21 

7 

- 

J01 

14 

1 

- 

- 

J21 

- 

J41 

J61  30  -----  1 

KOI  37  -  -  -  -  -  1 

K21  21 

L01  28 

NO  1  32 

P01  3  6 

R01  39 


2 


1 


Brown  Corpus 


Samples 


WERE 

1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17 


AO  1 

2 

A21 

17 

BO  1 

2 

1 

1 

C01 

DO  1 

3 

E01 

1 

E21 

F01 

3 

- 

- 

1 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

1 

1 

F21 

6 

GO  1 

1 

G21 

2 

G41 

12 

- 

G61 

10 

- 

HOI 

1 

- 

H21 

J01 

2 

J21 

1 

- 

J41 

J61 

13 

- 

- 

1 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

KOI 

7 

1 

- 

K21 

5 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

2 

- 

L01 

11 

- 

- 

1 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

1 

- 

NO  1 

13 

2 

- 

P01 

5 

R01 

7 

_ 

_ 

— 

_ 

— 

_ 

_ 

_ 

__ 

__ 

_ 

_ 

_ 

— 

21 


Occurrences  in  Reader's  Digest 

Shown next are tabulations of the occurrences of the lower-sign
sequences in our Reader's Digest text. The tabulations again
show the numerical occurrence in the sample text corpus (Reader's
Digest) of the lower-sign sequences which have been identified
above, subcategorized by the above 18 syntagmatic categories (in
columns). Each tabulation is identified and titled by its relevant
lower-sign word; but it should be noted clearly that all 18
categories of sequence are accounted for. Magazine articles are
numbered from 1 through 36, the first page of each article being
given in the next left-hand column. The 37th section consists of
humorous items found throughout the magazine, the first of which is
on page 16.

BE

1  2  3  4  5  6 


1 

2 

7 

- 

- 

- 

- 

- 

- 

2 

13 

2 

- 

- 

- 

1 

- 

1 

3 

21 

2 

- 

- 

- 

- 

- 

- 

4 

27 

5 

- 

- 

- 

2 

- 

- 

5 

37 

3 

- 

- 

— 

6 

— 

— 

6 

47 

1 

— 

- 

- 

2 

- 

- 

7 

49 

4 

- 

- 

- 

- 

- 

- 

8 

51 

5 

- 

- 

- 

2 

- 

- 

9 

59 

1 

- 

- 

- 

1 

- 

- 

10 

63 

1 

- 

- 

- 

5 

— 

— 

11 

69 

5 

— 

— 

— 

3 

— 

- 

12 

73 

5 

- 

- 

- 

- 

- 

- 

13 

79 

4 

- 

- 

- 

3 

- 

- 

14 

84 

2 

- 

- 

- 

1 

- 

- 

15 

91 

1 

- 

— 

- 

- 

— 

— 

16 

93 

5 

— 

— 

— 

3 

- 

- 

17 

98 

2 

- 

- 

- 

2 

- 

- 

18 

105 

6 

- 

- 

- 

1 

- 

1 

19 

109 

6 

- 

- 

- 

4 

- 

- 

20 

116 

1 

- 

- 

- 

1 

- 

- 

21 

122 

2 

— 

— 

— 

2 

— 

— 

22 

125 

- 

- 

- 

- 

1 

- 

- 

23 

129 

4 

- 

- 

- 

- 

- 

- 

24 

134 

9 

- 

- 

- 

1 

- 

- 

25 

137 

8 

- 

- 

- 

- 

- 

- 

26 

142 

2 

_ 

— 

— 

2 

— 

_ 

27 

145 

1 

- 

- 

- 

2 

- 

- 

28 

151 

1 

- 

- 

- 

2 

- 

- 

29 

155 

- 

- 

- 

- 

- 

- 

- 

30 

159 

6 

- 

- 

- 

3 

— 

- 

31 

167 

1 

— 

— 

— 

— 

- 

— 

32 

169 

- 

- 

- 

- 

- 

- 

- 

33 

171 

4 

- 

- 

- 

- 

- 

- 

34 

179 

3 

- 

- 

- 

- 

- 

- 

35 

183 

24 

- 

- 

- 

16 

— 

1 

36 

200 

_ 

— 

— 

— 

- 

— 

37 

16 

19 

— 

- 

- 

5 

- 

- 

8  9  10  11  12  13  14  15  16 


— 

- 

- 

2 

2 

1 

- 

3 

2 

— 

- 

1 

- 

3 

2 

19 

4 

— 

- 

- 

- 

2 

1 

7 

4 

- 

- 

3 

1 

3 

1 

6 

1 

- 

- 

5 

- 

1 

1 

7 

2 

_ 

— 

2 

1 

2 

4 

9 

— 

— 

- 

1 

— 

— 

- 

2 

8 

- 

- 

14 

1 

4 

7 

7 

1 

- 

- 

1 

- 

11 

3 

8 

2 

- 

- 

3 

1 

13 

7 

9 

2 

_ 

_ 

1 

1 

2 

1 

3 

15 

- 

- 

3 

2 

4 

19 

19 

3 

- 

- 

4 

- 

4 

1 

5 

5 

- 

- 

10 

2 

4 

8 

18 

- 

- 

- 

1 

- 

1 

- 

1 

3 

— 

— 

8 

1 

3 

6 

11 

6 

- 

- 

7 

1 

2 

2 

12 

6 

- 

- 

3 

3 

1 

- 

2 

6 

- 

- 

2 

1 

4 

6 

10 

8 

- 

- 

2 

1 

2 

2 

2 

2 

— 

— 

3 

1 

1 

3 

1 

1 

- 

2 

1 

- 

- 

- 

- 

2 

- 

- 

4 

- 

4 

7 

9 

1 

- 

- 

8 

- 

2 

2 

5 

6 

- 

- 

2 

1 

1 

9 

5 

3 

— 

_ 

5 

1 

1 

3 

2 

4 

- 

- 

- 

2 

9 

- 

6 

6 

- 

- 

1 

- 

- 

2 

- 

2 

- 

- 

- 

- 

6 

- 

2 

9 

- 

- 

4 

1 

3 

2 

13 

2 

— 

— 

1 

_ 

4 

4 

3 

- 

- 

- 

4 

- 

1 

1 

- 

2 

- 

- 

3 

- 

4 

2 

1 

4 

- 

- 

1 

1 

1 

1 

6 

35 

- 

- 

36 

19 

15 

32 

42 

1 

— 

— 

— 

— 

1 

— 

2 

11 

- 

- 

14 

6 

20 

13 

30 
BY

1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18


1 

2 

4 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

- 

— 

— 

2 

3 

4 

13 

21 

27 

2 

7 

3 

- 

- 

- 

- 

2 

1 

- 

- 

- 

- 

- 

- 

1 

- 

- 

- 

5 

6 

37 

47 

3 

2 

7 

49 

4 

8 

51 

13 

- 

- 

- 

- 

1 

- 

- 

- 

- 

- 

- 

14 

- 

- 

- 

- 

9 

59 

8 

10 
Deviations

On a quick scan by eye a few deviations will immediately be
noticed in the Brown Corpus (by genre) and in the Reader's Digest
(by magazine article), for example, for (in) (category 15) and
for (by) (category 13). These are doubtless to be credited to
idiosyncrasies in lexical choice or topic of the text or genre.
A closer and more refined inspection of such disparities would no
doubt give interesting results bearing on braille usage in
relation to English discourse and style.

Proportionate  Occurrence 

In the next tables occurrences of the lower-sign sequences are
summarized according to the types of text in which they appear.
The letters in the first column are those which identify the
classifications of material in the Brown Corpus based on subject
and style. The percentage distribution across the 18 categories of
braille representation is given for each of the lower-sign
sequences in each of the text types and in the Brown Corpus samples
as a whole. By way of comparison, for each lower-sign sequence,
total and percentage figures are given for the sequence's
occurrences in the Reader's Digest text.
One "0" occurrence in category 5 is also in category 7.

One "L" occurrence in category 5 is also in category 7.

Two "Rd Dg" occurrences in category 5 are also in category

D" occurrence in category 6 is also in category 7.

J" occurrence in category 6 is also in category 7.

Rd Dg" occurrences in category 8 are also in category

One "Rd Dg" occurrence in category 15 is also in category 17.

One "A" occurrence in category 6 is also in category 7.

One "C" occurrence in category 0 is also in category 7.

Six "J" occurrences in category 0 are also in category 7.
Two "Rd Dg" occurrences in category 0 are also in category

All occurrences in category 15 are also in category 17.
All occurrences in category 16 are also in category 18.

The foregoing tabulations give totals, for each Brown Corpus
genre  of  our  sample,  of  the  incidence  of  each  lower-sign  sequence 
in  all  18  of  its  possible  manifestations.  These  totals  vary  in 
interesting  ways  by  genre,  and  the  category  percentage  is  seen  to 
vary  considerably  also.  So,  for  (be)  category  1  varies  from  5.41 
to  33.33%,  and  the  sparser  category  5  varies  from  1  to  13%,  while 
category  12  hovers  between  1  and  10%;  yet  the  more  populous 
category  14  shows  a  more  stable  range  of  3.51  to  20.19%.  However, 
it  will  be  seen  by  closer  inspection  that  stability  does  not  go 
simply  together  with  frequency.  The  differences  in  these  ranges  of 
variation  must  have  to  do  with  contrasts  elicited  by  the  rules  and 
constraints  of  genre  and  with  the  content  of  English  linguistic 
form.  Remember  that  the  structure  and  rules  of  braille  itself 
govern  the  very  appearance  of  any  one  of  categories  1-18. 

This type of variation is rather less rich for (to) and (in).
In the sparser items such as (by), (enough), (into), and (were) we
find  far  less  in  this  dimension  to  work  with.  These  superficial 
remarks  are  not  intended  to  deal  adequately  in  any  way  with  the 
aspect  of  genre,  category,  and  linguistic  formational  variation, 
but  only  to  call  attention  to  this  fascinating  textual  parameter 
and  to  its  clear  manifestation  in  our  modest  data.  What  is  needed 
is  the  collection  and  analysis  of  a  much  larger  body  of  such 
contrastive  data,  and  a  long  period  of  study  which  will  surely  grow 
out  of  such  a  probe. 

Patterns of Incidence

If we now disregard this category variation and concentrate on
the totals of category incidence for each lower-sign sequence, we
can get a better visual grasp of the incidence of these categories
in our sample corpus. To do this we will translate the sequence of
category totals given for each lower-sign sequence into a template
of line graphs. The problem of estimating the relations in
behavior of these lower-sign sequences then becomes a simple one of
visual template fitting, which seems to suffice for our immediate
purpose.
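
The translation of category totals into a comparable profile can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used to draw the original graphs: the totals are the column sums of the Brown Corpus tabulation for (be), and the names `profile` and `text_graph` are hypothetical helpers introduced here.

```python
# Category totals (categories 1-18) for (be), obtained by summing each
# column of the Brown Corpus BE tabulation. Occurrences listed under two
# categories are counted once in each.
BE_TOTALS = [225, 0, 0, 0, 76, 0, 7, 154, 0, 0, 108, 29, 117, 165, 247, 0, 0, 0]


def profile(totals):
    """Return each category's share of the summed totals, in percent."""
    grand_total = sum(totals)
    return [100.0 * t / grand_total for t in totals]


def text_graph(totals, width=40):
    """Render one crude bar per category so profiles can be compared by eye."""
    shares = profile(totals)
    peak = max(shares)
    lines = []
    for category, share in enumerate(shares, start=1):
        bar = "#" * round(width * share / peak) if peak else ""
        lines.append(f"{category:>2} {share:6.2f}% {bar}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_graph(BE_TOTALS))
```

Printing the profiles of two sequences one above the other gives a rough textual equivalent of fitting one line-graph template against another.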

In a visual inspection of these displays it may be useful to
recall the frequency groupings:

frequency        rough congruence
rarest           (enough), (were), (into)
rare             (by)
median           (was), (his)
most frequent    (be), (to), (in)

This may provide a useful association for purposes of teaching
and mastery of these various manifestations and functions of
lower-sign sequences; but the specifics must be discussed separately.

Braille and Inkprint

We  now  proceed  to  study  the  incidence  of  lower-sign  braille 
units  in  relation  to  print  letter  sequences. 

Let us view the data which have been presented in detail in a
more summarizing fashion by displaying the braille disposition of
the print letter sequences corresponding to our lower-sign braille
units.

First, some notes recalling the representation in braille of
print occurrences of the letter sequences which correspond to the
lower-sign words. The table indicates the ways in which the letter
series in the text samples are transcribed into braille.

The text sources, selections from the Brown Corpus and an issue of
the Reader's Digest, are specified by "BC" and "RD."

In the first numerical column there is a count of the
occurrences of the letter sequence.

The second column gives the count of the lower-sign braille
contractions.

Sometimes the letter sequences under consideration occur
within a longer sequence which has its own braille contraction, as
be in (before), in in (blind), or to in (together). The count of
such inclusive contractions is given in the third column, and it
should be noted that the count of this type of sequence is not
included in the second column.

The fourth column of figures shows the number of times the
sequence is represented by being partly contracted. This figure is
independent of the two prior columns.

In some cases the first or last letter of a series is combined
with an adjacent letter or group of letters to form a contraction
which overlaps the lower-sign series. Examples are the overlap of:
by by (bb) in lo(bb)y, in by (ness) in busi(ness), to by (st) and
(one) in (st)(one), and was by (sh) in wa(sh). The count of this
type of occurrence is in column five.

The last column shows how many times the letter sequence is
found uncontracted, that is, represented letter for letter in
braille.


              print     lower-sign  inclusive  partial    overlapping  no
              letter    con-        con-       con-       con-         con-
              sequence  traction    traction   traction   traction     traction

BE      BC    1126      362         154                   247          363
                        32.15       13.68                 21.94        32.24
        RD    1184      362         169                   287          366
                        30.57       14.27                 24.24        30.91

BY      BC    287       242                               2            43
                        84.32                             .70          14.98
        RD    361       270                                            91
                        74.79                                          25.21

ENOUGH  BC    27        19                     8
                        70.37                  29.63
        RD    18        15                     3
                        83.33                  16.67

HIS     BC    650       310         280                   38           22
                        47.69       43.08                 5.85         3.38
        RD    500       287         165                   26           22
                        57.40       33.00                 5.20         4.40

IN      BC    4429      2813        1533                  66           17
                        63.51       34.61                 1.49         .38
        RD    5387      2939        2393                  26           29
                        54.56       44.42                 .48          .54

INTO    BC    92        86                     6
                        93.48                  6.52
        RD    156       144                    12         1*
                        92.31                  7.69       .64

TO      BC    1948      1141        125                   191          491
                        58.57       6.42                  9.80         25.21
        RD    2692      1476        179                   278          759
                        54.83       6.65                  10.33        28.19

WAS     BC    495       451                               30           14
                        91.11                             6.06         2.83
        RD    814       748                               33           33
                        91.89                             4.05         4.05

WERE    BC    135       124                    11         8*
                        91.85                  8.15       5.93
        RD    228       204                    24         22*
                        89.47                  10.53      9.65

* This count is included in the "partial contraction" count
on the same line. Braille representation in these occurrences
involves both contraction overlap and partial contraction.
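
The arithmetic behind each line of the table can be checked with a short sketch such as the one below. It assumes only the counts printed above; the function name `percentages` is a hypothetical helper, and the WERE check illustrates the starred note, under which the overlap count is already contained in the partial-contraction count and so is left out of the row sum.

```python
def percentages(total, counts):
    """Express each count as a percentage of the print letter sequence total."""
    return [round(100.0 * c / total, 2) for c in counts]


# BE, Brown Corpus: the four disposition counts partition the 1126 occurrences.
be_total = 1126
be_counts = [362, 154, 247, 363]
assert sum(be_counts) == be_total
be_percent = percentages(be_total, be_counts)

# WERE, Brown Corpus: the starred overlap count (8) is part of the
# partial-contraction count (11), so only 124 + 11 make up the total.
were_total = 135
were_lower_sign, were_partial, were_overlap = 124, 11, 8
assert were_lower_sign + were_partial == were_total
were_percent = percentages(were_total, [were_lower_sign, were_partial, were_overlap])

if __name__ == "__main__":
    print(be_percent)
    print(were_percent)
```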

In partly analogous fashion we now inspect the incidence of
"whole words" and these lower-sign braille units. The braille
disposition of the lower-sign words occurring as whole words in the
print text is shown in the following table. To render the results
comparable for purposes of this and the following tabulations the
Brown Corpus interpretation of "whole word" has been accepted
throughout.

Representation by Lower-Sign Units and Other Braille Units
of Lower-Sign Letter Sequences Occurring in Print as Whole Words

                            print    con-       partly con-   uncon-
                            whole    tracted    tracted       tracted
                            word

BE      Brown Corpus        306      225                      81
                                     73.53                    26.47
        Reader's Digest     224      152                      72
                                     67.86                    32.14

BY      Brown Corpus        270      242                      28
                                     89.63                    10.37
        Reader's Digest     306      270                      36
                                     88.24                    11.76

ENOUGH  Brown Corpus        27       19         8
                                     70.37      29.63
        Reader's Digest     18       15         3
                                     83.33      16.67

HIS     Brown Corpus        331      310                      21
                                     93.66                    6.34
        Reader's Digest     308      287                      21
                                     93.18                    6.82

IN      Brown Corpus        1043     1029                     14
                                     98.66                    1.34
        Reader's Digest     1070     1052                     18
                                     98.32                    1.68

INTO    Brown Corpus        89       86         3
                                     96.63      3.37
        Reader's Digest     151      144        7
                                     95.36      4.64

TO      Brown Corpus        1264     1141                     123
                                     90.27                    9.73
        Reader's Digest     1597     1476                     121
                                     92.42                    7.58

WAS     Brown Corpus        456      451                      5
                                     98.90                    1.10
        Reader's Digest     762      748                      14
                                     98.16                    1.84

WERE    Brown Corpus        127      124        3
                                     97.64      2.36
        Reader's Digest     206      204        2
                                     99.03      .97

Relative Frequency

Now  that  we  have  inspected  in  detail  some  important  relations 
between  the  occurrence  and  non-occurrence  of  these  lower-sign 
braille  units  in  relation  to  relevant  print  letter  sequences,  it 
might  be  of  interest  in  retrospect  to  summarize  the  frequencies  of 
these  units  in  the  text  studied. 

The  occurrences  of  the  lower-sign,  one-space  (or,  in  the  case 
of  (into),  two-space)  contractions  are  ranked  by  frequency  in  the 
following  table. 


Lower-Sign Word Contractions

            Brown Corpus Sample     Reader's Digest
            count      %            count      %

IN          2813       50.7         2939       45.6
TO          1141       20.6         1476       22.9
WAS         451        8.1          748        11.6
BE          362        6.5          362        5.6
HIS         310        5.6          287        4.5
BY          242        4.4          270        4.2
WERE        124        2.2          204        3.2
INTO        86         1.6          144        2.2
ENOUGH      19         0.3          15         0.2

TOTAL       5548                    6445

Sample Consistency

At the same time we have seen in the foregoing table the
gratifying agreement in overall incidence of these nine braille
shapes between the Brown Corpus sample and the sample from the
Reader's Digest. Comment on the small deviations between these may
be reserved for another occasion.

As  a  test  of  consistency  in  the  translation,  identification, 
and  categorized  collection  of  data  in  this  study,  we  offer  the 
following  comparison  of  the  constituency  of  our  data  for  lower-sign 
sequences  occurring  as  whole  words  with  that  published  for  the 
Brown Corpus (Kučera & Francis, 1967).

To  render  the  results  comparable  for  purposes  of  this 
tabulation  the  Brown  Corpus  interpretation  of  "whole  word"  has  been 
accepted.  For  the  same  reason  the  data  from  our  sample  and  from 
the  Reader's  Digest  result  in  every  case  from  summing  columns  1 
through  7  of  the  18  displayed  in  our  categorization. 


            Brown Corpus         Brown Corpus Sample    Reader's Digest
            count      %         count      %           count      %

TO          26149      32.1      1264       32.3        1597       34.4
IN          21341      26.2      1043       26.7        1070       23.1
WAS         9816       12.0      456        11.7        762        16.4
HIS         6997       8.6       331        8.5         308        6.6
BE          6377       7.8       306        7.8         224        4.8
BY          5305       6.5       270        6.9         306        6.6
WERE        3284       4.0       127        3.2         206        4.4
INTO        1791       2.2       89         2.3         151        3.3
ENOUGH      430        0.5       27         0.7         18         0.4

TOTAL       81490                3913                   4642

It will be seen that except for the three least frequent the
deviation  among  these  percentages  is  very  small.  The  greater 
deviation  for  the  Reader's  Digest  material  is  easily  understood  as 
reflecting  the  different  and  less  systematic  genre  representation 
among  its  texts. 
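
The percentage columns of this comparison can be recomputed from the raw counts, which also makes the size of each deviation explicit. The following is a minimal sketch under the assumption that the counts are exactly those tabulated above; the helper names `percent_shares` and `largest_deviation` are introduced here for illustration only.

```python
FULL = {"TO": 26149, "IN": 21341, "WAS": 9816, "HIS": 6997, "BE": 6377,
        "BY": 5305, "WERE": 3284, "INTO": 1791, "ENOUGH": 430}
SAMPLE = {"TO": 1264, "IN": 1043, "WAS": 456, "HIS": 331, "BE": 306,
          "BY": 270, "WERE": 127, "INTO": 89, "ENOUGH": 27}
DIGEST = {"TO": 1597, "IN": 1070, "WAS": 762, "HIS": 308, "BE": 224,
          "BY": 306, "WERE": 206, "INTO": 151, "ENOUGH": 18}


def percent_shares(counts, digits=1):
    """Share of each word among the nine lower-sign whole words, in percent."""
    total = sum(counts.values())
    return {word: round(100.0 * n / total, digits) for word, n in counts.items()}


def largest_deviation(reference, other):
    """Return (word, difference in percentage points) with the widest gap."""
    ref = percent_shares(reference, digits=4)
    oth = percent_shares(other, digits=4)
    word = max(ref, key=lambda w: abs(ref[w] - oth[w]))
    return word, round(abs(ref[word] - oth[word]), 2)


if __name__ == "__main__":
    print(percent_shares(FULL))
    print(largest_deviation(FULL, SAMPLE))
    print(largest_deviation(FULL, DIGEST))
```

On these counts the widest gap between our sample and the full Brown Corpus is under one percentage point, while the widest Reader's Digest gap is several points.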

It  might  be  of  interest  to  readers  to  compare  the  decrement  in 
incidence  of  these  lower-sign  words  in  the  Brown  Corpus  with  that 
displayed  in  the  consolidated  Lorge-Thorndike  counts  (Thorndike  and 
Lorge,  1944)  for  these  words  individually. 


            1          2           3           4
            Brown      Lorge       L-Th        Rounded
            Corpus     Magazine    Semantic    Average
                                               (2 & 3)

TO          26149      115358      ?*          115M
IN          21341      75253       96674       86M
WAS         9816       58732       42552       51M
HIS         6997       30748       32140       31M
BE          6377       19645       ?*          20M
BY          5305       11454       29130       20M
WERE        3284       15082       16340       16M
INTO        1791       9231        7016        8M
ENOUGH      430        2113        892         2M

* This means that data were not available from this count. In the
Preface to Thorndike and Lorge (1944) we read that "the recorders
of the semantic count . . . did not separate different forms of
words like be, come, and do . . .".

All these words are among the first 500 words and except for
enough  are  registered  as  totalling  800  times  or  more  in  the 
Thorndike  1931  data  and  1000  times  or  more  in  the  count  of  120 
juvenile  books.  In  the  case  of  enough  the  count  was  800  or  more 
for  the  1931  data,  but  only  estimated  at  over  1000  for  the  juvenile 
books.  All  these  words  rank  at  100  or  over  per  million. 

The  congruence  in  the  decrement  profile  for  these  words  in 
these  corpus  counts  is  reassuring. 

Some  Further  Conclusions 

The question may be asked how salient or important the
lower-sign words or braille units are in the braille code. Such a
question deserves a carefully considered answer, but there are
different ways of answering it. One may, for example, consider it
from the point of view of the overall efficiency of the code, or
as a function leading to the learning and mastery of other
elements or aspects of the code. Such considerations need not
detain us now, but can certainly be usefully addressed at a later
time.

For  the  present  we  shall  content  ourselves  with  a  simple 
quantitative  observation.  Our  Brown  Corpus  sample  comprises  50,000 
words  of  American  English  of  which  3,913  are  whole  words  of  the 
lower-sign  set  discussed  in  this  paper;  this  means  that  the  total 
lower-sign words comprise 7.8% of the sample text, and that ca.
.57% are lower-sign words which are uncontracted (i.e., not
represented by lower-sign braille units for contextual reasons).
These  figures  compare  favorably  with  those  for  the  full  Brown 
Corpus, which comprises 1,000,000 words of which 81,490 are whole
words  of  our  lower-sign  set,  or  8%  of  the  total  text. 

We  thus  arrive  at  a  quantitative  statement  of  the  proportional 
occurrence  of  the  objects  of  our  study.  Out  of  50,000  words  7.25% 
(=  7.8%  less  .57%),  or  3,627  words,  are  represented  by  lower-sign 
braille  units.  Out  of  180,000  (actually  179,429)  braille  units 
comprising  our  sample  text  3%  (5,548)  are  lower-sign  braille  units. 
In  other  words,  one  in  every  32  braille  units  of  text  will  be  a 
lower  sign;  one  may  expect  to  encounter  on  the  average  a  lower-sign 
braille  unit  in  almost  every  line  of  braille. 
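
The proportions quoted in the preceding paragraphs can be reproduced from the raw counts. The following is a minimal sketch for checking the arithmetic, not the authors' own computation. The variable names are hypothetical, and the uncontracted count is derived here as the difference between two stated figures rather than taken from the text.

```python
# Counts stated in the text for the 50,000-word Brown Corpus sample.
sample_words = 50_000
lower_sign_words = 3_913        # whole words of the lower-sign set
contracted_words = 3_627        # of these, represented by lower-sign units
braille_units = 179_429         # braille units in the sample text
lower_sign_units = 5_548        # lower-sign braille units among them

# Counts stated for the full Brown Corpus.
full_corpus_words = 1_000_000
full_lower_sign_words = 81_490

# Derived here (assumption): uncontracted = total minus contracted.
uncontracted_words = lower_sign_words - contracted_words

pct_lower_sign = 100 * lower_sign_words / sample_words
pct_contracted = 100 * contracted_words / sample_words
pct_uncontracted = 100 * uncontracted_words / sample_words
pct_full_corpus = 100 * full_lower_sign_words / full_corpus_words
pct_units = 100 * lower_sign_units / braille_units
units_per_lower_sign = braille_units / lower_sign_units

print(f"lower-sign words in sample:   {pct_lower_sign:.2f}%")
print(f"  contracted:                 {pct_contracted:.2f}%")
print(f"  uncontracted:               {pct_uncontracted:.2f}%")
print(f"lower-sign words, full corpus: {pct_full_corpus:.2f}%")
print(f"lower-sign braille units:     {pct_units:.2f}%")
print(f"one lower sign per {units_per_lower_sign:.1f} braille units")
```

Working from the unrounded counts also shows why the text gives 7.25% where subtracting the rounded figures (7.8% less .57%) would suggest 7.23%.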

Given  the  textual  frequency  which  we  have  observed  for  these 
lower-sign  braille  units,  and  consequently  for  the  lower-sign 
shapes,  as  a  group,  it  is  clear  that  for  any  learner  of  braille  the 
mastery  of  these  braille  units  and  shapes  forms  a  considerable 
portion  of  the  total  learning  task.  Far  from  being  just  one  small 
set  of  signs,  with  their  own  peculiar  structure  and  habits,  among 
the  total  inventory  of  the  braille  code,  these  units  and  shapes 
furnish  a  predictable  feature  of  nearly  every  line  of  braille 
encountered  by  a  fluent  reader.  Any  successful  program  of  teaching 
must  address  itself  to  this  task. 

Even  a  superficial  glance  at  the  relative  frequencies  of  the 
lower-sign  shapes  (contractions)  gives  us  a  strong  indication  of 
important  parameters.  (in)  and  (to)  must  clearly  be  made  priority 
items  in  the  braille  learning  and  mastery  task.  Note  that  the 
importance  of  (in)  (and  (be))  is  greatly  enhanced  by  participation 
in  the  constitution  of  larger  words;  the  role  of  (in)  as  a 
productive prefix shape in English further enhances its frequency
and  salience.  The  great  dominance  of  (was)  over  (were)  is  notable; 
this  is  easily  explained  by  the  fact  that  in  English  the  plural  is 
the  marked  term  in  relation  to  the  singular,  and  therefore  may  be 
expected  to  be,  as  a  complex  rather  than  a  simple  term,  less 
frequent than the singular. In this list (be), (his), and (by)
may be regarded as the median items, from which no special lesson
may immediately be drawn.

(into) is in a class by itself, and as a complex in its
morphology may be expected, like (were), to show a depressed
frequency. Relevant experts and users might wish to consider
redefining (into) as a sequence of its components, thereby
removing it as an apparent additional item in this inventory of
special behavior.

(enough)  is  clearly  the  least  urgent  task  among  all  of  these 
in  respect  of  priority  for  mastery.  Responsible  experts  might 
profitably  consider  whether  this  contraction  really  pays  its  way  in 
the  braille  code.  Is  its  yield  worth  the  investment  of  learning? 


See English Braille American Edition 1959 (Louisville: American
Printing House for the Blind, 1970) Rule XIII and §39; S. C.
Ashcroft (1960); E. J. Rex (1970); Lorimer, Tobin, Gill, & Douce
(1982).

It is useful to make clear at the outset that in this
discussion only those lower-sign braille units that represent whole
English words (i.e. which include logograms with downshifted
alphabetic shapes) are analyzed and studied. That is to say,
lower-sign non-words (i.e. morphograms and phonograms with downshifted
alphabetic shapes, e.g. dis-) are not included in this study. Of
course, it is not implied that the latter are without interest and
importance, even in relation to our present subject and class. They
are simply a topic for another day, and should not distract and
delay us from the present self-evident subject.

2 It is in this spirit and on this basis that Caton, Pester, and
Bradley  (1980)  was  constructed.  As  our  studies  continue  and 
findings  are  refined  we  must  hope  to  revise  that  work  and  other 
like  instructional  materials,  and  to  reduce  their  inadequacies. 

3 American Association of Workers for the Blind (1913); Report of
the Uniform Type Committee. The Outlook for the Blind, 7 (1913):
1-48. Ashcroft (1960); Nolan and Kederis (1969, pp. 27-28, 37, 47,
65-66; pp. 87-94 concentrates on upper-dot cells).

4 In the terms of Hamp and Caton (1984) represented as a logogram.

5 In the terms of Hamp and Caton (1984) represented in phonograms.

6 In the terms of Hamp and Caton (1984) represented in braille
alphabetic letters.

7 Either as phonograms, morphograms, or not, in the terms of Hamp
and Caton (1984).

8 That is, the component sequence may appear with its logographic
value or simply with the value of a phonogram corresponding to a
print sequence.

Summary


Study  I.  A  Fresh  Look  at  the  Sign  System  of  the  Braille  Code 

The  internal  analysis  as  described  in  the  1984  paper 
aimed  at  devising  new  descriptions  of  the  elements  of  the 
braille  code.  This  was  done  to  eliminate  the  many  conflicting 
and  confusing  terms  and  categories  previously  used  by  teachers 
and  students,  and  to  provide  them  with  a  new  system  consisting 
of  a  relatively  small  number  of  categories,  or  groups,  and 
clear,  precise,  linguistically  based  descriptions  of  all  the 
elements of English braille, Grade 2.

In  addition,  the  discussion  is  intended  to  clarify  the 
internal  characteristics  of  the  code. 

We believe that the results of this analysis can be used to:

1.  Provide  teachers  with  an  outline  of  braille  terms 
that  will  enable  them  to  describe  and  discuss  any 
element  of  the  braille  code  in  a  manner  easily 
understood  by  children. 

2.  Emphasize  that  the  teaching  of  braille  reading  and 
print  reading  are  not  analogous,  and  promote  an 
understanding  of  the  internal  characteristics  of 
the  braille  code,  thus  providing  teachers  with  more 
effective  strategies  for  teaching  reading  to 
children  who  use  braille  as  their  primary  medium. 

3.  Form  a  basis  for  any  further  analysis  or  productive 
understanding  of  the  function  built  into  the 
structure  of  the  English  braille  code. 


Study  II.  Toward  A  Refinement  of  the  Linguistic  Analysis  of 
American  Literary  Braille,  Grade  2 

1.  A plan to continue and refine the 1984 analysis is
outlined; it has been pursued since 1989.

2.  The  conceptual  foundations  of  the  analysis  are  placed  in 
the  context  of  20th  century  linguistic  theory  in  brief 
and  summary  form. 

3.  To  fulfil  the  first  requisite  a  suitable  text  corpus  must 
serve  as  an  empirical  data  base.  The  creation  of  such  a 
corpus  extracted  from  the  "Brown  Corpus"  is  described. 

4.  As  a  sample  result,  classed  totals  of  braille  units  (cf. 
the  first  study)  contained  in  the  sample  corpus  are 
presented. 

Study  III.  An  Analysis  of  Braille  Word  Lengths 

1.  The  lengths  of  all  words  in  our  sample  of  the  Brown 
Corpus  (cf.  second  study  above),  with  the  exception  of 
certain  categories  of  text  spans  (cf.  Introduction),  were 
counted  in  terms  of  print  characters,  braille  units,  and 
braille  shapes,  each  word  being  given  a  characterization 
coding  these  three  counts.  The  frequency  of  each  of 
these  codings  and  their  constituents  could  then  be 
counted.

2.  Short  words  are  most  frequent,  but  more  so  in  braille 
than  in  inkprint;  63%  of  braille,  and  43%  of  print  words, 
have  three  or  fewer  units,  or  letters. 

3.  Nearly 20% of all words contain three print characters,
yet but one braille shape.

4.  Words  with  no  braille  contractions  comprise  ca.  one- 
fourth  of  the  total. 

5.  More than one-half of the words show one or two fewer
shapes in braille than characters in print.

6.  Our computations confirm the saving in lengths through
braille contractions, but also indicate a general
correspondence between braille and print lengths and a
close correspondence between braille unit and shape
counts.  Tabulations bring out deviations and
correspondences of detail.

7.  These  computations  permit  the  correction  of  exaggerated 
claims  or  impressions. 

8.  Accurate  data  on  English  word  lengths  yield  implications 
for  learning  and  education  and  for  our  understanding  and 
criticism  of  the  design  of  braille. 

9.  For a language such as English our ability to supply this
measure  means  a  clue  to  a  useful  metric  of  English 
discourse  and  genre. 

Study  IV.  The  Text  Frequency  and  Incidence  of  Lower-Sign 
Sequences  in  American  Braille 

1.  The data for study of the nine lower-sign sequences have
been taken exhaustively from our sample of the Brown
Corpus (cf. second study, above) and from a recent number
of the Reader's Digest.  Both logogram (contraction) and
spelt-out representations have been studied, and these
occurring as inkprint words or enclosed in larger words
or other logograms.  Eighteen categories of
representation are discriminated.  All counts are
subdivided by Brown Corpus genres; this gives the
possibility of study by discourse and stylistic type.
Observations are then made on the numerical properties of
these counts, and visual line graphs are presented.

Some  general  results  emerge  for  priorities  in  teaching 
and  for  relative  quantity  of  work  contributed  to  braille 
text  among  these  lower-sign  words.  The  Reader's  Digest 
sample  provides  a  control  on  the  Brown  Corpus  count  for 
overall  incidence. 

The  decrement  in  incidence  of  these  words  is  also 
compared  with  that  found  in  Lorge-Thorndike  1944. 

The  question  of  the  importance  or  salience  of  these  words 
is  both  broached  here  and  reserved  for  the  future. 


